A NOD FACTOR BINDING PROTEIN FROM LEGUME ROOTS

CROSS REFERENCE TO RELATED APPLICATIONS 
This is a continuation in part of USSN 08/907,226, filed August 6, 1997,
which is incorporated herein by reference.



FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT 
This invention was made with Government support under Grant No.
GM21882, awarded by the National Institutes of Health and under Grant No. DCB
9004967, awarded by the National Science Foundation. The Government has certain
rights in this invention.

BACKGROUND OF THE INVENTION 
Usable nitrogen is the major limiting nutrient in crop plant growth. Plants
derive most of their nutrients, including nitrogen, from the soil through uptake in the root
system. Although most of the nitrogen in the soil is in the form of ammonium ions, which
are rapidly converted to usable nitrates by bacteria in the soil, the harvesting of plants results
in a steady decrease of nitrogen in the soil. Unless the soil is augmented with
nitrogen-containing compounds, the soil becomes depleted of usable nitrogen and only
atmospheric nitrogen remains.

Legumes, unlike other higher plants, are able through a symbiotic
relationship with bacteria to utilize atmospheric nitrogen in the soil. The bacteria,
Rhizobia, infect leguminous seedlings and induce nodulation, the end result being the
presence within the root system of nodules which contain the rhizobial bacteroids. Once
within the root system, the bacteroids are able to "fix" atmospheric nitrogen into organic
compounds the legumes can use. In exchange for the conversion of atmospheric nitrogen,
the plants provide the bacteroids with carbon-containing compounds, other nutrients, and a
protective environment.

Although the "fixed" nitrogen is used throughout the plant in the growth
and development of its organs and tissues, much of the usable nitrogen remains within the
nodules of the roots. This empirical finding has led to the practice of crop rotation wherein
a non-leguminous plant, i.e., corn, is grown and harvested and then the field is sown with a
legume, such as alfalfa. After harvest of the legume, the remaining roots are plowed under
and thus, usable nitrogen is returned to the soil for the sowing of the non-leguminous crop.

The legumes recognize the rhizobial bacteria through a lectin-carbohydrate
interaction. Within the root system, the plants contain lectins that bind to specific
carbohydrates found on the Rhizobium cell wall. This interaction is very specific, with
each plant recognizing and being infected by one rhizobial strain.

In addition to their involvement in recognition of rhizobial bacteria,
oligosaccharide signaling events play important roles in the regulation of plant
development, defense, and other interactions of plants with the environment (Ryan, C.A.
and Farmer, E.E. Annu. Rev. Plant Physiol. Plant Mol. Biol. 42:651-674 (1991); Cote, F.
and Hahn, M.G. Plant Mol. Biol. 26:1379-1411 (1994); Denarie, J. et al. Annu. Rev.
Biochem. 65:503-535 (1996)). Although the structures of some of these oligosaccharides
have been characterized, little is known about the plant receptors for these signals, nor the
mechanism(s) by which these signals are transduced.

Previously, a root lectin, NBP46 (formerly called DB46), was isolated from
young Dolichos biflorus root extracts. NBP46 is a 46 kDa protein that was isolated by
affinity chromatography on hog gastric mucin blood group A + H substance conjugated to
Sepharose (Quinn, J.M. and Etzler, M.E. Arch. Biochem. Biophys. 258:535-544 (1987)).

Identification and characterization of proteins and the genes that encode
them is important to modulation of oligosaccharide signaling in plants. For instance, a
transgenic non-leguminous plant containing a factor that allows rhizobial bacteria to infect
the plant and fix nitrogen would lessen the need for the addition of nitrogen-containing
fertilizer to soil and preclude the necessity of crop rotation in nitrogen-depleted fields.
This would lead to higher yields of crop plants in areas of the world where the soil has
been overplanted and to replenishment of the depleted soil with usable nitrogen. The present
invention addresses these and other needs.

SUMMARY OF THE INVENTION

This invention provides for the isolation and cloning of the cDNA of
NBP46 (SEQ ID NO:1), which encodes NBP46, a Nod factor binding lectin. Nod factors
are carbohydrates on the surface of Rhizobium which bind to lectins on the surface of
leguminous plant organs and can initiate nodulation of the root system by the plants. The
NBP46 gene encodes a polypeptide of between 50 and 560 amino acids, more preferably
462 amino acids (SEQ ID NO:2).

In a preferred embodiment, the NBP46 coding sequence is operably linked
to a plant specific promoter, more preferably a root specific promoter, such as the NBP46
promoter (SEQ ID NO:3).

In another embodiment, an expression cassette comprising the NBP46 gene 
is introduced into a transgenic plant. In a preferred embodiment, the expression of NBP46 
by the transgenic plant confers to the plant the ability to bind to rhizobial bacteria and 
utilize atmospheric nitrogen. In a particularly preferred embodiment, the expression of 
NBP46 confers to the plant the ability to catalyze the hydrolysis of the phosphoanhydride 
bonds of di- and tri-phosphates, leading to greater availability of nutrients to the plant. 

In a further embodiment of the instant invention, methods of modulating
rhizobial interactions and phosphatase activity in plants by the introduction of an
expression cassette comprising NBP46 are disclosed.

BRIEF DESCRIPTION OF THE FIGURES 
Figure 1 indicates the inhibition of binding of 125I-NBP46 to HBG A +
H-Sepharose®.

In Figure 1A, the legend is as follows: HBG A + H (■); human ovarian
cyst blood group A substance (♦); human ovarian cyst blood group H substance (^);
de-N-acetylated HBG A + H (•).

In Figure 1B, the legend is as follows: Bradyrhizobium japonicum USDA110
Nod factor (■); β-O-methyl galactose β(1-3) N-acetyl-D-glucosamine (O); methyl
α-N-acetyl-D-glucosamine (•); methyl β-N-acetyl-D-glucosamine (♦); dimer (a), trimer
(□), and tetramer (0) of β(1-4) N-acetyl-D-glucosamine.

Figure 2 shows the effect of carbohydrate ligands on phosphatase activity of
NBP46. NBP46 (201 ng/ml) was preincubated for 1 hour in the presence of various
concentrations of B. japonicum USDA110 Nod factor (■), R. sp. NGR234(Ac) Nod factor
(▼), R. sp. NGR234(S) Nod factor (a), R. meliloti Nod factor (•), or cis-vaccenic acid (♦)
and then assayed for phosphatase activity using a final concentration of 3 mM Mg-ADP.

Figure 3 shows inhibition of binding of 125I-NBP46 to chitin. Various
concentrations of mono- and oligosaccharides were combined with 109 ng 125I-NBP46 and
250 μg of chitin in a total volume of 100 μl. B. japonicum USDA110 Nod factor (■); R.
sp. NGR234(NGRa) Nod factor (^); R. sp. NGR234(NGRb) Nod factor (V); R. meliloti
Nod factor (•), N-acetylglucosamine (□), chitin disaccharide (▼); chitin tetrasaccharide
(A); chitin pentasaccharide (♦), chitin hexasaccharide (O).

DETAILED DESCRIPTION OF THE INVENTION 



I. Definitions 

The phrase "isolated nucleic acid molecule" or "isolated protein" refers to a
nucleic acid or protein which is essentially free of other cellular components with which it
is associated in the natural state. It is preferably in a homogeneous state although it can be
in either a dry or aqueous solution. Purity and homogeneity are typically determined using
analytical chemistry techniques such as polyacrylamide gel electrophoresis or high
performance liquid chromatography. A protein which is the predominant species present
in a preparation is substantially purified. In particular, an isolated NBP46 gene is
separated from open reading frames which flank the gene and encode a protein other than
NBP46. The term "purified" denotes that a nucleic acid or protein gives rise to essentially
one band in an electrophoretic gel. Particularly, it means that the nucleic acid or protein is
at least 85% pure, more preferably at least 95% pure, and most preferably at least 99%
pure.

A "promoter" is defined as an array of nucleic acid control sequences that
direct transcription of an operably linked nucleic acid. As used herein, a "plant promoter"
is a promoter that functions in plants. Promoters include necessary nucleic acid sequences
near the start site of transcription, such as, in the case of a polymerase II type promoter, a
TATA element. A promoter also optionally includes distal enhancer or repressor elements,
which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that
is active under environmental or developmental regulation. The term "operably linked"
refers to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

The term "plant" includes whole plants, plant organs (e.g., leaves, stems,
flowers, roots, etc.), seeds and plant cells and progeny of same. The class of plants which
can be used in the method of the invention is generally as broad as the class of higher
plants amenable to transformation techniques, including angiosperms (monocotyledonous
and dicotyledonous plants), as well as gymnosperms. It includes plants of a variety of
ploidy levels, including polyploid, diploid, haploid and hemizygous.

A polynucleotide sequence is "heterologous to" an organism or a second
polynucleotide sequence if it originates from a foreign species, or, if from the same
species, is modified from its original form. For example, a promoter operably linked to a
heterologous coding sequence refers to a coding sequence from a species different from
that from which the promoter was derived, or, if from the same species, a coding sequence
which is different from any naturally occurring allelic variants.

A polynucleotide "exogenous to" an individual plant is a polynucleotide
which is introduced into the plant by any means other than by a sexual cross. Examples of
means by which this can be accomplished are described below, and include
Agrobacterium-mediated transformation, biolistic methods, electroporation, and the like.
Such a plant containing the exogenous nucleic acid is referred to here as an R1 generation
transgenic plant. Transgenic plants which arise from sexual cross or by selfing are
descendants of such a plant.

The phrase "rhizobial binding" refers to the binding between rhizobial
bacteria and plant cells. Typically, enhanced binding leads to infection by rhizobial
bacteria of the roots of plants. This in turn leads to nodule formation in the roots. For
example, a non-leguminous transgenic plant comprising a polynucleotide of this invention
and expressing its corresponding polypeptide in the roots of the plant would bind to Nod
factors of rhizobial bacteria, allowing the plant to become infected by the rhizobial bacteria,
to reduce the atmospheric nitrogen contained in the soil, and to use it
as a nutrient.

The phrase "operably linked" refers to a functional linkage between a 
promoter and a second sequence, wherein the promoter sequence initiates transcription of 
RNA corresponding to the second sequence. 

The term "polynucleotide," "polynucleotide sequence" or "nucleic acid
sequence" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either
single- or double-stranded form. Unless specifically limited, the term encompasses nucleic
acids containing known analogs of natural nucleotides which have similar binding
properties as the reference nucleic acid and are metabolized in a manner similar to
naturally occurring nucleotides. Unless otherwise indicated, a particular NBP46 nucleic
acid sequence of this invention also implicitly encompasses conservatively modified
variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as
well as the sequence explicitly indicated. Specifically, degenerate codon substitutions
may be achieved by generating sequences in which the third position of one or more
selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues
(Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-
2608 (1985); and Cassol et al., 1992; Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)).
The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a
gene.

A "NBP46 polynucleotide" is a nucleic acid sequence comprising (or
consisting of) a coding region of about 100 to about 2000 nucleotides, sometimes from
about 1400 to about 1500 nucleotides, which hybridizes to SEQ ID NO:1 under stringent
conditions (as defined below), or which encodes a NBP46 polypeptide.

The term "sexual reproduction" refers to the fusion of gametes to produce
seed by pollination. A "sexual cross" is pollination of one plant by another. "Selfing" is
the production of seed by self-pollination, i.e., pollen and ovule are from the same plant.

In the case of both expression of transgenes and inhibition of endogenous 
genes (e.g., by antisense, or sense suppression) one of skill will recognize that the inserted 
polynucleotide sequence need not be identical, but may be only "substantially identical" to 
a sequence of the gene from which it was derived. As explained below, these substantially 
identical variants are specifically covered by the term NBP46 nucleic acid.

In the case where the inserted polynucleotide sequence is transcribed and
translated to produce a functional polypeptide, one of skill will recognize that because of
codon degeneracy a number of polynucleotide sequences will encode the same
polypeptide. These variants are specifically covered by the term "NBP46 nucleic acid".
In addition, the term specifically includes those sequences substantially identical
(determined as described below) with an NBP46 polynucleotide sequence disclosed here
and that encode polypeptides that are either mutants of wild type NBP46 polypeptides or
retain the function of the NBP46 polypeptide (e.g., resulting from conservative
substitutions of amino acids in the NBP46 polypeptide). In addition, variants can be those
that encode dominant negative mutants as described below.

Two nucleic acid sequences or polypeptides are said to be "identical" if the
sequence of nucleotides or amino acid residues, respectively, in the two sequences is the
same when aligned for maximum correspondence as described below. The terms
"identical" or percent "identity," in the context of two or more nucleic acids or polypeptide
sequences, refer to two or more sequences or subsequences that are the same or have a
specified percentage of amino acid residues or nucleotides that are the same, when
compared and aligned for maximum correspondence over a comparison window, as
measured using one of the following sequence comparison algorithms or by manual
alignment and visual inspection. When percentage of sequence identity is used in
reference to proteins or peptides, it is recognized that residue positions that are not
identical often differ by conservative amino acid substitutions, where amino acid residues
are substituted for other amino acid residues with similar chemical properties (e.g., charge
or hydrophobicity) and therefore do not change the functional properties of the molecule.
Where sequences differ in conservative substitutions, the percent sequence identity may be
adjusted upwards to correct for the conservative nature of the substitution. Means for
making this adjustment are well known to those of skill in the art. Typically this involves
scoring a conservative substitution as a partial rather than a full mismatch, thereby
increasing the percentage sequence identity. Thus, for example, where an identical amino
acid is given a score of 1 and a non-conservative substitution is given a score of zero, a
conservative substitution is given a score between zero and 1. The scoring of conservative
substitutions is calculated according to, e.g., the algorithm of Meyers & Miller, Computer
Applic. Biol. Sci. 4:11-17 (1988), e.g., as implemented in the program PC/GENE
(Intelligenetics, Mountain View, California, USA).
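
The partial-mismatch scoring described above can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned to equal length; the function name, the small set of conservative pairs, and the partial score of 0.5 are illustrative assumptions and are not taken from Meyers & Miller or from PC/GENE.

```python
# Illustrative assumption: a tiny set of residue pairs treated as conservative.
CONSERVATIVE_PAIRS = {frozenset("DE"), frozenset("KR"), frozenset("NQ")}


def percent_identity(seq_a, seq_b, conservative_score=0.0):
    """Score two pre-aligned, equal-length sequences position by position.

    Identical residues score 1, non-conservative substitutions score 0, and
    conservative substitutions score `conservative_score` (between 0 and 1).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    total = 0.0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            total += 1.0
        elif frozenset((a, b)) in CONSERVATIVE_PAIRS:
            total += conservative_score  # partial rather than full mismatch
    return 100.0 * total / len(seq_a)
```

With a partial score of zero the function reports plain percent identity; raising the partial score adjusts the percentage upwards only at positions that differ by a pair listed as conservative.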

The phrase "substantially identical," in the context of two nucleic acids or 
polypeptides, refers to sequences or subsequences that have at least 60%, preferably 80%,
most preferably 90-95% nucleotide or amino acid residue identity when aligned for 
maximum correspondence over a comparison window as measured using one of the 
following sequence comparison algorithms or by manual alignment and visual inspection. 
This definition also refers to the complement of a test sequence, which has substantial 
sequence or subsequence complementarity when the test sequence has substantial identity 
to a reference sequence. 

For sequence comparison, typically one sequence acts as a reference 
sequence, to which test sequences are compared. When using a sequence comparison 
algorithm, test and reference sequences are entered into a computer, subsequence 
coordinates are designated, if necessary, and sequence algorithm program parameters are 
designated. Default program parameters can be used, or alternative parameters can be 
designated. The sequence comparison algorithm then calculates the percent sequence 
identities for the test sequences relative to the reference sequence, based on the program 
parameters. 

A "comparison window", as used herein, includes reference to a segment of
any one of the number of contiguous positions selected from the group consisting of from
20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a
sequence may be compared to a reference sequence of the same number of contiguous
positions after the two sequences are optimally aligned. Methods of alignment of
sequences for comparison are well-known in the art. Optimal alignment of sequences for
comparison can be conducted, e.g., by the local homology algorithm of Smith &
Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of
Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of
Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized
implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the
Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr.,
Madison, WI), or by manual alignment and visual inspection.

One example of a useful algorithm is PILEUP. PILEUP creates a multiple
sequence alignment from a group of related sequences using progressive, pairwise
alignments to show relationship and percent sequence identity. It also plots a tree or
dendrogram showing the clustering relationships used to create the alignment. PILEUP
uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol.
Evol. 35:351-360 (1987). The method used is similar to the method described by Higgins
& Sharp, CABIOS 5:151-153 (1989). The program can align up to 300 sequences, each of
a maximum length of 5,000 nucleotides or amino acids. The multiple alignment procedure
begins with the pairwise alignment of the two most similar sequences, producing a cluster
of two aligned sequences. This cluster is then aligned to the next most related sequence or
cluster of aligned sequences. Two clusters of sequences are aligned by a simple extension
of the pairwise alignment of two individual sequences. The final alignment is achieved by
a series of progressive, pairwise alignments. The program is run by designating specific
sequences and their amino acid or nucleotide coordinates for regions of sequence
comparison and by designating the program parameters. For example, a reference
sequence can be compared to other test sequences to determine the percent sequence
identity relationship using the following parameters: default gap weight (3.00), default gap
length weight (0.10), and weighted end gaps.
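
As an illustration of the pairwise step on which these alignment programs build, the following Python sketch computes a local alignment score in the manner of the Smith & Waterman local homology algorithm. It is a simplified sketch and not the GCG or PILEUP implementation: the function name is hypothetical, and the match, mismatch, and linear gap values are assumptions chosen for readability rather than the gap weight and gap length weight quoted above.

```python
def smith_waterman_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Return the best local alignment score using a linear gap penalty.

    Only two rows of the dynamic programming matrix are kept, which is
    enough when the score (and not the alignment itself) is wanted.
    """
    prev = [0] * (len(seq_b) + 1)
    best = 0
    for a in seq_a:
        curr = [0]
        for j, b in enumerate(seq_b, start=1):
            diag = prev[j - 1] + (match if a == b else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

A progressive method would apply a pairwise comparison of this kind to every pair of sequences, then merge the most similar pair first and continue with the next most related sequence or cluster.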

Another example of an algorithm that is suitable for determining percent
sequence identity and sequence similarity is the BLAST algorithm, which is described in
Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST
analyses is publicly available through the National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring
sequence pairs (HSPs) by identifying short words of length W in the query sequence,
which either match or satisfy some positive-valued threshold score T when aligned with a
word of the same length in a database sequence. T is referred to as the neighborhood word
score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds
for initiating searches to find longer HSPs containing them. The word hits are extended in
both directions along each sequence for as far as the cumulative alignment score can be
increased. Extension of the word hits in each direction is halted when: the cumulative
alignment score falls off by the quantity X from its maximum achieved value; the
cumulative score goes to zero or below, due to the accumulation of one or more
negative-scoring residue alignments; or the end of either sequence is reached. The BLAST
algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
The BLAST program uses as defaults a wordlength (W) of 11, the BLOSUM62 scoring
matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments
(B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
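
The extension step described above can be sketched in Python as follows. This is a simplified illustration and not the NCBI implementation: it starts from an exact word match rather than a neighborhood word scored against T, it does not allow gaps, the function names are hypothetical, and the default X of 20 is an assumption (the text names X but gives no value). The match and mismatch scores follow the M=5 and N=-4 values given in the text.

```python
def extend_one_way(query, subject, q, s, step, seed_score, x_drop, match, mismatch):
    """Extend a hit without gaps in one direction (step is +1 or -1).

    Returns the best score gain and the number of positions kept.
    """
    score = best = seed_score
    length = best_length = 0
    while 0 <= q < len(query) and 0 <= s < len(subject):
        score += match if query[q] == subject[s] else mismatch
        length += 1
        if score > best:
            best, best_length = score, length
        # Halt when the score falls X below its maximum or reaches zero.
        if best - score >= x_drop or score <= 0:
            break
        q += step
        s += step
    return best - seed_score, best_length


def extend_word_hit(query, subject, q_start, s_start, word_len,
                    x_drop=20, match=5, mismatch=-4):
    """Grow an exact word hit into a high scoring segment pair (HSP).

    Returns (score, query_start, query_end), with query_end exclusive.
    """
    seed_score = word_len * match
    right_gain, right_len = extend_one_way(
        query, subject, q_start + word_len, s_start + word_len, +1,
        seed_score, x_drop, match, mismatch)
    left_gain, left_len = extend_one_way(
        query, subject, q_start - 1, s_start - 1, -1,
        seed_score, x_drop, match, mismatch)
    score = seed_score + right_gain + left_gain
    return score, q_start - left_len, q_start + word_len + right_len
```

A larger X lets the extension cross a run of mismatches to reach a further matching stretch, at the cost of more computation, which is the sensitivity and speed trade-off the text attributes to W, T, and X.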

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA
90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the
smallest sum probability (P(N)), which provides an indication of the probability by which
a match between two nucleotide or amino acid sequences would occur by chance. For
example, a nucleic acid is considered similar to a reference sequence if the smallest sum
probability in a comparison of the test nucleic acid to the reference nucleic acid is less than
about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical sequences. Because of the degeneracy of the genetic code, a large
number of functionally identical nucleic acids encode any given protein. For instance, the
codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every
position where an alanine is specified by a codon, the codon can be altered to any of the
corresponding codons described without altering the encoded polypeptide. Such nucleic
acid variations are "silent variations," which are one species of conservatively modified
variations. Every nucleic acid sequence herein which encodes a polypeptide also describes
every possible silent variation of the nucleic acid. One of skill will recognize that each
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine)
can be modified to yield a functionally identical molecule. Accordingly, each silent
variation of a nucleic acid which encodes a polypeptide is implicit in each described
sequence.
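
The degeneracy described above can be made concrete with a short Python sketch that builds the standard genetic code, lists the codons for a given amino acid, and counts how many coding sequences specify the same peptide. The function names are hypothetical, and the sketch assumes the standard code applies.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Codon -> one-letter amino acid ('*' marks a stop codon).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given amino acid, sorted."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_coding_sequences(peptide):
    """Count the distinct coding sequences that encode the same peptide."""
    count = 1
    for aa in peptide:
        count *= len(synonymous_codons(aa))
    return count
```

For alanine the lookup returns GCA, GCC, GCG and GCU, and for methionine it returns only AUG, matching the text. The count includes the original sequence, so the number of silent variations of a given sequence is one less than the value returned.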

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alters, adds or deletes a single amino acid or a small percentage of amino
acids in the encoded sequence is a "conservatively modified variant" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well known
in the art.

The following six groups each contain amino acids that are conservative
substitutions for one another:

1) Alanine (A), Serine (S), Threonine (T);
2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
(see, e.g., Creighton, Proteins (1984)).
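
The six groups listed above can be encoded directly, as in the following Python sketch. The function names are hypothetical, and treating residues outside the six groups (for example glycine or proline) as having no conservative substitute is an assumption of this sketch.

```python
# The six conservative substitution groups, in one-letter code.
GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]
GROUP_OF = {aa: index for index, members in enumerate(GROUPS) for aa in members}


def is_conservative(a, b):
    """True when two different residues belong to the same group."""
    return a != b and a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b)


def differ_only_conservatively(pep_a, pep_b):
    """True when equal-length peptides differ only by conservative substitutions."""
    if len(pep_a) != len(pep_b):
        return False
    return all(a == b or is_conservative(a, b) for a, b in zip(pep_a, pep_b))
```

A check of this kind operates on peptides that are already aligned position by position; insertions and deletions would first have to be resolved by an alignment step.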

An indication that two nucleic acid sequences or polypeptides are
substantially identical is that the polypeptide encoded by the first nucleic acid is
immunologically cross reactive with the antibodies raised against the polypeptide encoded
by the second nucleic acid. Thus, a polypeptide is typically substantially identical to a
second polypeptide, for example, where the two peptides differ only by conservative
substitutions. Another indication that two nucleic acid sequences are substantially
identical is that the two molecules or their complements hybridize to each other under
stringent conditions, as described below.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under
stringent hybridization conditions when that sequence is present in a complex mixture
(e.g., total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acid, but to no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays"
(1993). Generally, highly stringent conditions are selected to be about 5-10°C lower than
the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
Low stringency conditions are generally selected to be about 15-30°C below the Tm. The
Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at
which 50% of the probes complementary to the target hybridize to the target sequence at
equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are
occupied at equilibrium). Stringent conditions will be those in which the salt concentration
is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for
short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g.,
greater than 50 nucleotides). Stringent conditions may also be achieved with the addition
of destabilizing agents such as formamide. For selective or specific hybridization, a
positive signal is at least two times background, preferably 10 times background
hybridization.
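
The temperature ranges and thresholds stated above can be collected in a small Python sketch. It takes the Tm as an input rather than estimating it, since no Tm formula is given in the text; the function names are hypothetical, and the "about" in each stated range is treated as exact for simplicity.

```python
def stringency_windows(tm_celsius):
    """Return (low, high) temperature ranges in degrees C relative to the Tm."""
    return {
        # Highly stringent: about 5-10 degrees C below the Tm.
        "high": (tm_celsius - 10.0, tm_celsius - 5.0),
        # Low stringency: about 15-30 degrees C below the Tm.
        "low": (tm_celsius - 30.0, tm_celsius - 15.0),
    }


def minimum_stringent_temperature(probe_length):
    """Minimum temperature for stringent conditions, by probe length."""
    # Short probes (up to 50 nucleotides): at least about 30 degrees C;
    # longer probes: at least about 60 degrees C.
    return 30.0 if probe_length <= 50 else 60.0


def is_positive_signal(signal, background):
    """A positive signal is at least two times background."""
    return signal >= 2.0 * background
```

For a probe with a Tm of 70°C, for example, the sketch places highly stringent hybridization at 60-65°C and low stringency hybridization at 40-55°C.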

Nucleic acids that do not hybridize to each other under stringent conditions
are still substantially identical if the polypeptides which they encode are substantially
identical. This occurs, for example, when a copy of a nucleic acid is created using the
maximum codon degeneracy permitted by the genetic code. In such cases, the nucleic
acids typically hybridize under moderately stringent hybridization conditions.

In the present invention, genomic DNA or cDNA comprising NBP46
nucleic acids of the invention can be identified in standard Southern blots under stringent
conditions using the nucleic acid sequences disclosed here. For the purposes of this
disclosure, suitable stringent conditions for such hybridizations are those which include a
hybridization in a buffer of 40% formamide, 1 M NaCl, 1% SDS at 37°C, and at least one
wash in 0.2X SSC at a temperature of at least about 50°C, usually about 55°C to about
60°C, for 20 minutes, or equivalent conditions. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization
and wash conditions can be utilized to provide conditions of similar stringency.

A further indication that two polynucleotides are substantially identical is if
the reference sequence, amplified by a pair of oligonucleotide primers, can then be used as
a probe under stringent hybridization conditions to isolate the test sequence from a cDNA
or genomic library, or to identify the test sequence in, e.g., a northern or Southern blot.

The phrase "transgenic plant" refers to a plant into which heterologous
polynucleotides have been introduced by any means other than sexual cross or selfing.
Examples of means by which this can be accomplished are described below, and include
Agrobacterium-mediated transformation, biolistic methods, electroporation, in planta
techniques, and the like. Such a plant containing the heterologous polynucleotides is
referred to here as an R1 generation transgenic plant. Transgenic plants may also arise
from sexual cross or by selfing of transgenic plants into which heterologous
polynucleotides have been introduced.

II. Introduction

The present invention provides polynucleotides referred to here as NBP46
polynucleotides, as exemplified by SEQ ID NO:1. Polypeptides encoded by the genes of
the invention are lectins involved in binding a variety of carbohydrates. In addition, the
polypeptides function as enzymes, catalyzing the dephosphorylation of nucleotide di-
and triphosphates. As explained below, the nucleic acid sequences of the invention code
for a Nod factor binding lectin naturally expressed in the root tissue of leguminous plants.

The polypeptides of the invention are also involved in oligosaccharide
signaling events that play important roles in the regulation of plant development, defense,
and other interactions of plants with the environment. Although the structures of some of
these oligosaccharides have been characterized in the prior art, little is known about the
plant receptors for these signals, nor the mechanism(s) by which these signals are
transduced. The results presented below show that polypeptides of the invention serve as
receptors in oligosaccharide signaling.

Generally, the nomenclature and the laboratory procedures in recombinant
DNA technology described below are those well known and commonly employed in the
art. Standard techniques are used for cloning, DNA and RNA isolation, amplification and
purification. Generally, enzymatic reactions involving DNA ligase, DNA polymerase,
restriction endonucleases and the like are performed according to the manufacturer's
specifications. These techniques and various other techniques are generally performed
according to Sambrook, et al

III. Isolation Of Nucleic Acid Sequences From Plants

The isolation of sequences from the genes of the invention may be
accomplished by a number of techniques. For instance, oligonucleotide probes based on
the nucleic acid and peptide sequences disclosed herein can be used to identify the desired
gene in a cDNA or genomic DNA library from a desired leguminous plant species. To
construct genomic libraries, large segments of genomic DNA are generated by random
fragmentation, e.g., using restriction endonucleases, and are ligated with vector DNA to
form concatemers that can be packaged into the appropriate vector. To prepare a library of
tissue-specific cDNAs, mRNA is isolated from tissues and a cDNA library which contains
the gene transcripts is prepared from the mRNA.

The cDNA or genomic library can then be screened using a probe based 
upon the sequence of a cloned gene such as the polynucleotides disclosed here. Probes 
may be used to hybridize with genomic DNA or cDNA sequences to isolate homologous 
genes in the same or different plant species. 

Alternatively, the nucleic acids of interest can be amplified from nucleic
acid samples using amplification techniques. For instance, polymerase chain reaction
(PCR) technology can be used to amplify the sequences of the genes directly from mRNA,
from cDNA, from genomic libraries or cDNA libraries. PCR and other in vitro
amplification methods may also be useful, for example, to clone nucleic acid sequences
that code for proteins to be expressed, to make nucleic acids to use as probes for detecting
the presence of the desired mRNA in samples, for nucleic acid sequencing, or for other
purposes known to those of skill.

Appropriate primers and probes for identifying NBP46 genes from Dolichos
biflorus or transgenic plant tissues are generated from comparisons of the sequences
provided herein. For a general overview of PCR see PCR Protocols: A Guide to
Methods and Applications (Innis, M., Gelfand, D., Sninsky, J. and White, T., eds.),
Academic Press, San Diego (1990). Appropriate degenerate primers for this invention
include, for instance: a 5' PCR primer [5'-TA(T/C)GCNGTNAT(T/C)TT(T/C)GATGC-3']
(SEQ ID NO:4) and a 3' PCR primer [5'-AT(A/G)TT(A/G)TA(T/A/G)AT(G/A)CCNGG-3']
(SEQ ID NO:5), where N denotes all nucleotides. The amplification conditions are
typically as follows. Reaction components: 10 mM Tris-HCl, pH 8.3, 50 mM potassium
chloride, 1.5 mM magnesium chloride, 0.001% gelatin, 200 µM dATP, 200 µM dCTP,
200 µM dGTP, 200 µM dTTP, 0.4 µM primers, and 100 units per mL Taq polymerase.
Program: 96°C for 3 min., 30 cycles of 96°C for 45 sec., 50°C for 60 sec., 72°C for 60 sec.,
followed by 72°C for 5 min.
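The degeneracy of the two primers given above can be checked with a short script. The following is a minimal Python sketch offered only as an illustration, not as part of the disclosed method. The function names parse_primer and degeneracy are hypothetical, and the sketch assumes that each parenthesized group lists the alternative bases at a single position and that N stands for any of A, C, G or T.

```python
import re
from math import prod

# Bases allowed by each single-letter symbol used in the primer notation.
# Assumption: N means any of the four nucleotides, as stated in the text.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}


def parse_primer(notation):
    """Split a primer such as 'TA(T/C)GCN' into one string of allowed bases per position."""
    positions = []
    for group, single in re.findall(r"\(([ACGT/]+)\)|([ACGTN])", notation):
        if group:
            # A parenthesized group such as 'T/C' lists the alternatives at one position.
            positions.append(group.replace("/", ""))
        else:
            positions.append(SYMBOLS[single])
    return positions


def degeneracy(notation):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    return prod(len(allowed) for allowed in parse_primer(notation))


# Primer sequences as written in the text (SEQ ID NO:4 and SEQ ID NO:5).
FORWARD = "TA(T/C)GCNGTNAT(T/C)TT(T/C)GATGC"
REVERSE = "AT(A/G)TT(A/G)TA(T/A/G)AT(G/A)CCNGG"

if __name__ == "__main__":
    for name, primer in (("5' primer", FORWARD), ("3' primer", REVERSE)):
        print(name, len(parse_primer(primer)), "nt,", degeneracy(primer), "variants")
```

Under these assumptions the 5' primer is a 20-mer mixture and the 3' primer a 17-mer mixture; the degeneracy is simply the product of the number of alternatives at each position.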

Using the above primers, a partial coding sequence will be obtained. There
are many techniques known to those of skill to determine and isolate the complete coding
sequence. These methods include using the PCR amplified subsequence to probe a cDNA
library for longer sequences.

A preferred method is RACE (Frohman, et al., Proc. Nat'l. Acad. Sci. USA
85:8998 (1988)). Briefly, this technique involves using PCR to amplify a DNA sequence
using a random 5' primer and a defined 3' primer, e.g., (SEQ ID NO:6) (5' RACE) or a
random 3' primer and a defined 5' primer, e.g., (SEQ ID NO:7) (3' RACE). The amplified
sequence is then subcloned into a vector where it is then sequenced using standard
techniques. Kits to perform RACE are commercially available (e.g., 5' RACE System,
GIBCO BRL, Grand Island, New York, USA). In this manner, the entire NBP46 coding
sequence of about 1600 bp can be obtained (SEQ ID NO:1). The invention also provides
the genomic sequence of NBP46 (SEQ ID NO:3).

Alternatively, primers can be selected and synthesized by those of skill
from the cDNA sequence disclosed in SEQ ID NOs:1 and 3.

Polynucleotides may also be synthesized by well-known techniques as
described in the technical literature. See, e.g., Caruthers, et al., Cold Spring Harbor
Symp. Quant. Biol. 47:411-418 (1982), and Adams, et al., J. Am. Chem. Soc. 105:661
(1983). Double stranded DNA fragments may then be obtained either by synthesizing the
complementary strand and annealing the strands together under appropriate conditions, or
by adding the complementary strand using DNA polymerase with an appropriate primer
sequence.

IV. Use Of Nucleic Acids Of The Invention To Modulate Gene Expression

The polynucleotides of the invention can be used to enhance expression
(i.e., increase expression of an endogenous gene or provide NBP46 expression in a plant
that does not normally express NBP46) of genes of the invention and thereby enhance
infection of transgenic plants by rhizobial bacteria, increase the level of nutrients taken up
by the plants, and affect the growth and development of transgenic plants. Alternatively,
enhanced expression can be used to modulate oligosaccharide signaling in the plant. This
can be accomplished by the overexpression of NBP46 polypeptides in the tissues of
transgenic plants.

The heterologous NBP46 polynucleotides do not have to code for exact
copies of the NBP46 proteins exemplified herein. Modified NBP46 polypeptide chains
can also be readily designed utilizing various recombinant DNA techniques well known to
those skilled in the art and described, for instance, in Sambrook et al., supra.
Hydroxylamine can also be used to introduce single base mutations into the coding region
of the gene (Sikorski, et al., Meth. Enzymol. 194:302-318 (1991)). For example, the
chains can vary from the naturally occurring sequence at the primary structure level by
amino acid substitutions, additions, deletions, and the like. These modifications can be
used in a number of combinations to produce the final modified protein chain.

Alternatively, the nucleic acid sequences of the invention can be used to
inhibit expression of an endogenous gene. One of skill will recognize that a number of
methods can be used to inactivate or suppress NBP46 activity or gene expression. The
control of the expression can be achieved by introducing mutations into the gene or using
recombinant DNA techniques. These techniques are generally well known to one of skill
and are discussed briefly below.

Methods for introducing genetic mutations into plant genes are well
known. For instance, seeds or other plant material can be treated with a mutagenic
chemical substance, according to standard techniques. Such chemical substances include,
but are not limited to, the following: diethyl sulfate, ethylene imine, ethyl
methanesulfonate and N-nitroso-N-ethylurea. Alternatively, ionizing radiation from
sources such as, for example, X-rays or gamma rays can be used. Desired mutants are
selected by assaying for increased seed mass, oil content and other properties.

Gene expression can be inactivated using recombinant DNA techniques by
transforming plant cells with constructs comprising transposons or T-DNA sequences.
NBP46 mutants prepared by these methods are identified according to standard techniques.
For instance, mutants can be detected by PCR or by detecting the presence or absence of
NBP46 mRNA, e.g., by Northern blots. Mutants can also be selected by assaying for
increased seed mass, oil content and other properties.

The isolated sequences prepared as described herein can also be used in a
number of techniques to suppress endogenous NBP46 gene expression. A number of
methods can be used to inhibit gene expression in plants. For instance, antisense
technology can be conveniently used. To accomplish this, a nucleic acid segment from the
desired gene is cloned and operably linked to a promoter such that the antisense strand of
RNA will be transcribed. The construct is then transformed into plants and the antisense
strand of RNA is produced. In plant cells, it has been suggested that antisense RNA
inhibits gene expression by preventing the accumulation of mRNA which encodes the
enzyme of interest, see, e.g., Sheehy et al., Proc. Nat. Acad. Sci. USA, 85:8805-8809
(1988), and Hiatt et al., U.S. Patent No. 4,801,340.

The nucleic acid segment to be introduced generally will be substantially
identical to at least a portion of the endogenous NBP46 gene or genes to be repressed. The
sequence, however, need not be perfectly identical to inhibit expression. The vectors of
the present invention can be designed such that the inhibitory effect applies to other genes
within a family of genes exhibiting homology or substantial homology to the target gene.

For antisense suppression, the introduced sequence also need not be full
length relative to either the primary transcription product or fully processed mRNA.
Generally, higher homology can be used to compensate for the use of a shorter sequence.
Furthermore, the introduced sequence need not have the same intron or exon pattern, and
homology of non-coding segments may be equally effective. Normally, a sequence of
between about 30 or 40 nucleotides and about full length nucleotides should be used,
though a sequence of at least about 100 nucleotides is preferred, a sequence of at least
about 200 nucleotides is more preferred, and a sequence of about 500 to about 1700
nucleotides is especially preferred.
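As a rough illustration of the antisense approach, the sequence placed behind the promoter is the reverse complement of a segment of the target gene, and its length can be compared with the ranges stated above. The following Python sketch is hypothetical: the function names, the tier labels, and the 12-nucleotide example fragment are invented for illustration, and the "about" in each range is treated as a hard cutoff purely for simplicity.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (the antisense orientation)."""
    return seq.translate(_COMPLEMENT)[::-1]


def length_tier(n_nucleotides):
    """Classify an insert length against the approximate ranges given in the text."""
    if 500 <= n_nucleotides <= 1700:
        return "especially preferred"
    if n_nucleotides >= 200:
        return "more preferred"
    if n_nucleotides >= 100:
        return "preferred"
    if n_nucleotides >= 30:
        return "usable"
    return "below suggested minimum"


if __name__ == "__main__":
    # Hypothetical sense-strand fragment; far shorter than any real insert would be.
    fragment = "ATGGCCTTAGCA"
    print(reverse_complement(fragment), length_tier(len(fragment)))
```

In practice the fragment would be taken from the NBP46 sequences disclosed herein; the short example only shows the orientation change.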

Catalytic RNA molecules or ribozymes can also be used to inhibit
expression of NBP46 genes. It is possible to design ribozymes that specifically pair with
virtually any target RNA and cleave the phosphodiester backbone at a specific location,
thereby functionally inactivating the target RNA. In carrying out this cleavage, the
ribozyme is not itself altered, and is thus capable of recycling and cleaving other
molecules, making it a true enzyme. The inclusion of ribozyme sequences within antisense
RNAs confers RNA-cleaving activity upon them, thereby increasing the activity of the
constructs.

A number of classes of ribozymes have been identified. One class of
ribozymes is derived from a number of small circular RNAs which are capable of self-
cleavage and replication in plants. The RNAs replicate either alone (viroid RNAs) or with
a helper virus (satellite RNAs). Examples include RNAs from avocado sunblotch viroid
and the satellite RNAs from tobacco ringspot virus, lucerne transient streak virus, velvet
tobacco mottle virus, Solanum nodiflorum mottle virus and subterranean clover mottle
virus. The design and use of target RNA-specific ribozymes is described in Haseloff et al.,
Nature, 334:585-591 (1988).

Another method of suppression is sense cosuppression. Introduction of
nucleic acid configured in the sense orientation has been recently shown to be an effective
means by which to block the transcription of target genes. For an example of the use of
this method to modulate expression of endogenous genes see Napoli et al., The Plant Cell
2:279-289 (1990), and U.S. Patents Nos. 5,034,323, 5,231,020, and 5,283,184.

The suppressive effect may occur where the introduced sequence contains
no coding sequence per se, but only intron or untranslated sequences homologous to
sequences present in the primary transcript of the endogenous sequence. The introduced
sequence generally will be substantially identical to the endogenous sequence intended to
be repressed. This minimal identity will typically be greater than about 65%, but a higher
identity might exert a more effective repression of expression of the endogenous
sequences. Substantially greater identity of more than about 80% is preferred, though
about 95% to absolute identity would be most preferred. As with antisense regulation, the
effect should apply to any other proteins within a similar family of genes exhibiting
homology or substantial homology.
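The identity thresholds above can be illustrated with a simple calculation on two sequences that have already been aligned. This Python sketch is an illustration only: the function names and the tier labels are hypothetical, identity is computed here as matching columns divided by aligned columns (one of several conventions in use), and the approximate thresholds in the text are treated as exact cutoffs.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences have a gap.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def identity_tier(identity_percent):
    """Map a percent identity onto the approximate tiers described in the text."""
    if identity_percent >= 95.0:
        return "most preferred"
    if identity_percent > 80.0:
        return "preferred"
    if identity_percent > 65.0:
        return "minimal"
    return "below minimal identity"


if __name__ == "__main__":
    # Hypothetical 10-column alignment with a single mismatch.
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, identity_tier(identity))
```

A real comparison would first align the introduced and endogenous sequences with an alignment program; the sketch covers only the counting step.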

For sense suppression, the introduced sequence, needing less than absolute
identity, also need not be full length, relative to either the primary transcription product or
fully processed mRNA. This may be preferred to avoid concurrent production of some
plants which are overexpressers. A higher identity in a shorter than full length sequence
compensates for a longer, less identical sequence. Furthermore, the introduced sequence
need not have the same intron or exon pattern, and identity of non-coding segments will be
equally effective. Normally, a sequence of the size ranges noted above for antisense
regulation is used.

A. Preparation of Recombinant Vectors

To use isolated sequences in the above techniques, recombinant DNA
vectors suitable for transformation of plant cells are prepared. Techniques for
transforming a wide variety of higher plant species are well known and described in the
technical and scientific literature. See, for example, Weising, et al., Ann. Rev. Genet.
22:421-477 (1988). A DNA sequence coding for the desired polypeptide, for example a
cDNA sequence encoding the full length NBP46 protein, will preferably be combined with
transcriptional and translational initiation regulatory sequences which will direct the
transcription of the sequence from the gene in the intended tissues of the transgenic plant,
i.e., a root-specific promoter.

Promoters can be identified by analyzing the 5' sequences of a genomic
clone in which naturally occurring Nod factor binding protein-specific genes, i.e., NBP46,
can be found. At the 5' end of the coding sequence, nucleotide sequences characteristic of
promoter sequences can be used to identify the promoter. Sequences controlling
eukaryotic gene expression have been extensively studied. For instance, promoter
sequence elements include the TATA box consensus sequence (TATAAT), which is
usually 20 to 30 base pairs upstream of the transcription start site. In most instances the
TATA box is required for accurate transcription initiation. In plants, further upstream
from the TATA box, at positions -80 to -100, there is typically a promoter element with a
series of adenines surrounding the trinucleotide G (or T) N G. J. Messing, et al., in
Genetic Engineering in Plants, pp. 221-227 (Kosuge, Meredith and Hollaender, eds.
(1983)).
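A scan for the consensus element described above can be sketched in a few lines. The Python example below is a hedged illustration rather than a promoter-prediction method: the function name, the synthetic test sequence, and the choice to measure distance from the start of the motif to the transcription start site are all assumptions, and the motif string is simply the consensus quoted in the text.

```python
def find_motif_upstream(seq, tss_index, motif="TATAAT", min_dist=20, max_dist=30):
    """Return offsets (negative, relative to the transcription start site) at which
    the motif begins between min_dist and max_dist bases upstream of tss_index."""
    seq = seq.upper()
    hits = []
    for dist in range(min_dist, max_dist + 1):
        start = tss_index - dist
        # Skip windows that would begin before the start of the sequence.
        if start >= 0 and seq[start:start + len(motif)] == motif:
            hits.append(-dist)
    return hits


if __name__ == "__main__":
    # Synthetic sequence for illustration: the motif starts 25 bases upstream
    # of a hypothetical transcription start site at index 35.
    upstream = "G" * 10 + "TATAAT" + "C" * 19
    sequence = upstream + "ATGGCC"
    print(find_motif_upstream(sequence, tss_index=len(upstream)))
```

An exact-match scan like this misses variant elements; a position weight matrix would be the usual refinement, but that goes beyond what the text describes.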

A number of methods are known to those of skill in the art for identifying
and characterizing promoter regions in plant genomic DNA (see, e.g., Jordano, et al.,
Plant Cell 1:855-866 (1989); Bustos, et al., Plant Cell 1:839-854 (1989); Green, et al.,
EMBO J. 7:4035-4044 (1988); Meier, et al., Plant Cell 3:309-316 (1991); and Zhang, et
al., Plant Physiology 110:1069-1079 (1996)).

In construction of recombinant expression cassettes of the invention, a plant
promoter fragment may be employed which will direct expression of the gene in all tissues
of a regenerated plant. Such promoters are referred to herein as "constitutive" promoters
and are active under most environmental conditions and states of development or cell
differentiation. Examples of constitutive promoters include the cauliflower mosaic virus
(CaMV) 35S transcription initiation region, the 1'- or 2'-promoter derived from T-DNA of
Agrobacterium tumefaciens, and other transcription initiation regions from various plant
genes known to those of skill.

Alternatively, the plant promoter may direct expression of the
polynucleotide of the instant invention in a specific tissue (tissue-specific promoters) or
may be otherwise under more precise environmental control (inducible promoters).
Examples of tissue-specific promoters under developmental control include promoters that
initiate transcription only in certain tissues, such as roots, fruit, seeds, or flowers.
Examples of environmental conditions that may affect transcription by inducible promoters
include anaerobic conditions, elevated temperature, or the presence of light.

If proper polypeptide expression is desired, a polyadenylation region at the
3'-end of the coding region should be included. The polyadenylation region can be derived
from the natural gene, from a variety of other plant genes, or from T-DNA.

The vector comprising the sequences (e.g., promoters or coding regions)
from genes of the invention will typically comprise a marker gene which confers a
selectable phenotype on plant cells. For example, the marker may encode biocide
resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418,
bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorsulfuron or
Basta.

B. Production of Transgenic Plants

DNA constructs of the invention may be introduced into the genome of a
desired plant host by a variety of conventional techniques. For example, the DNA
construct may be introduced directly into the genomic DNA of a plant cell using
techniques such as electroporation and microinjection of plant cell protoplasts, or the DNA
constructs can be introduced directly into plant tissue using ballistic methods, such as
DNA particle bombardment. Alternatively, the DNA constructs may be combined with
suitable T-DNA flanking regions and introduced into a conventional Agrobacterium
tumefaciens host vector. The virulence functions of the Agrobacterium tumefaciens host
will direct the insertion of the construct and adjacent marker into the plant cell DNA when
the cell is infected by the bacteria.

Microinjection techniques are known in the art and well described in the
scientific and patent literature. The introduction of DNA constructs using polyethylene
glycol precipitation is described in Paszkowski, et al., EMBO J. 3:2717-2722 (1984).
Electroporation techniques are described in Fromm, et al., Proc. Nat'l. Acad. Sci. USA
82:5824 (1985). Ballistic transformation techniques are described in Klein, et al., Nature
327:70-73 (1987).

Agrobacterium tumefaciens-mediated transformation techniques, including
disarming and use of binary vectors, are well described in the scientific literature. See, for
example, Horsch, et al., Science 233:496-498 (1984), and Fraley, et al., Proc. Nat'l. Acad.
Sci. 80:4803 (1983).

Transformed plant cells which are derived by any of the above
transformation techniques can be cultured to regenerate a whole plant which possesses the
transformed genotype and thus the desired phenotype. Such regeneration techniques rely
on manipulation of certain phytohormones in a tissue culture growth medium, typically
relying on a biocide and/or herbicide marker which has been introduced together with the
desired nucleotide sequences. Plant regeneration from cultured protoplasts is described in
Evans, et al., Protoplasts Isolation and Culture, Handbook of Plant Cell
Culture, pp. 124-176, Macmillan Publishing Company, New York (1983); and Binding,
Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton
(1985). Regeneration can also be obtained from plant callus, explants, organs, or parts
thereof. Such regeneration techniques are described generally in Klee, et al., Ann. Rev. of
Plant Phys. 38:467-486 (1987).

To determine the presence of a reduction or increase of NBP46 activity, a
variety of assays can be used including enzymatic, immunochemical, electrophoretic
detection assays (either with staining or western blotting), or complex carbohydrate
binding assays.

In a preferred embodiment, a competitive solid phase assay is used to
measure NBP46 activity (Etzler, M.E., Glycoconj. J. 11:395 (1994)). This assay measures
the ability of various ligands to inhibit the binding of labeled NBP46 protein to pronase-
digested hog gastric mucin blood group A + H substance (HBG A + H) conjugated to
Sepharose® (Quinn, J.M. & Etzler, M.E., Arch. Biochem. Biophys. 258:535 (1987)).

The nucleic acids of the invention can be used to confer desired traits on
essentially any plant. Thus, the invention has use over a broad range of plants, including
species from the genera Asparagus, Avena, Brassica, Citrus, Citrullus, Capsicum,
Cucurbita, Daucus, Glycine, Hordeum, Lactuca, Lycopersicon, Malus, Manihot,
Nicotiana, Oryza, Persea, Pisum, Pyrus, Prunus, Raphanus, Secale, Solanum, Sorghum,
Triticum, Vitis, Vigna, and Zea.

One of skill will recognize that after the expression cassette is stably 
incorporated in transgenic plants and confirmed to be operable, it can be introduced into 
other plants by sexual crossing. Any of a number of standard breeding techniques can be 
used, depending upon the species to be crossed. 

Effects of gene manipulation can be observed by northern blots of the
mRNA isolated from the tissues of interest. Typically, if the amount of mRNA has
increased, it can be assumed that the gene is being expressed at a greater rate than before.
Another method of assessing NBP46 expression is to measure the rhizobial
infection of the transgenic plants. Alternatively, the ability of the plant to reduce
atmospheric nitrogen could be assessed. In addition, levels of NBP46 could be measured
immunochemically, i.e., ELISA, RIA, EIA and other antibody based assays well known to
those of skill in the art.

V. Examples 

The following examples are offered to illustrate, but not to limit the claimed
invention.

Example 1: Characterization and cloning of NBP46
Carbohydrate binding and characterization of NBP46

It has been previously demonstrated that NBP46 is a 46 kDa protein that
can be isolated from young Dolichos biflorus root extracts by affinity chromatography on
hog gastric mucin blood group A + H substance (HBG A + H) conjugated to Sepharose®
(Quinn, J.M. & Etzler, M.E., Arch. Biochem. Biophys. 258:535 (1987)). The monomeric
nature of NBP46 in solution precluded the use of conventional precipitin or agglutination
assays in determining the carbohydrate binding specificity of this lectin. Therefore a
complex carbohydrate binding assay was employed (Etzler, M.E., Glycoconj. J. 11:395
(1994)).

As shown in Figure 1, various concentrations of blood group substances (A)
and oligosaccharides (B) were combined with 109 ng 125I-NBP46 (isolated as described in
Quinn, J.M. & Etzler, M.E., Arch. Biochem. Biophys. 258:535 (1987)) and a pronase digest
of HBG A + H-Sepharose® (final concentration 1%) in a volume of 100 µL of 5 mM
MOPS, pH 7.2, containing 0.025% Tween-20® and 0.01% NaN3. Hog blood group A + H
substance was isolated from hog gastric mucin (Etzler, M.E., Glycoconj. J. 11:395 (1994))
and de-N-acetylated as described in Etzler, M.E., et al., Arch. Biochem. Biophys. 141:588
(1970). After incubation at room temperature overnight, binding was measured as
previously described (Etzler, M.E., Glycoconj. J. 11:395 (1994)). Although the binding of
NBP46 to this resin was inhibited by free HBG A + H (Figure 1A), no significant
inhibition was obtained with up to 50 mM concentrations of any of the monosaccharides
present in the blood group substance, including N-acetyl-D-galactosamine and L-fucose,
the immunodominant sugars of the blood type A and H determinants, respectively
(Watkins, W.M., Science 152:172 (1966); and Lloyd, K.O., et al., Proc. Nat'l. Acad. Sci.
USA 61:1470 (1968)). Individual human ovarian cyst blood group A and H substances
(provided by Elvin A. Kabat, Columbia University) were equal to one another in inhibitory
capacity but much weaker than HBG A + H (Figure 1A). De-N-acetylation of the blood
type A determinant did not alter the ability of the HBG A + H to inhibit the binding of
NBP46 (Figure 1A).

These results indicated that the binding of NBP46 to the above blood group
substances was due to its recognition of some portion of the oligosaccharide chains other
than the blood type A and H determinants and that its carbohydrate binding site
accommodated more than a simple sugar. The carbohydrate specificity of NBP46 thus
differs from the blood type A specific seed lectin from Dolichos biflorus, which recognizes
the α-N-acetyl-D-galactosamine residues which are at the nonreducing ends of the
oligosaccharide chains of blood group A substance (Etzler, M.E., et al., Biochemistry
9:869 (1970)).

A variety of oligosaccharides were tested in an attempt to obtain more
information on the carbohydrate specificity of NBP46 (Figure 1B). The strongest
inhibition was obtained with the purified Nod factor from Bradyrhizobium japonicum
USDA 110, a rhizobial strain that nodulates soybean and can also nodulate
Dolichos biflorus. The Nod factor was isolated as described in Sanjuan, J., et al., Proc.
Nat'l. Acad. Sci. USA 89:8789 (1992). The Nod factor from Bradyrhizobium japonicum
USDA 110 is composed of a β1-4 N-acetyl-D-glucosamine pentasaccharide backbone,
modified by a 2-O-methyl α-L-fucose on C-6 of the sugar at the reducing end and the
substitution of the acetyl group on the sugar at the nonreducing end with a C18:1 fatty acyl
chain (Sanjuan, J., et al., Proc. Nat'l. Acad. Sci. USA 89:8789 (1992)). Thus, NBP46 can
be characterized as a Nod factor binding lectin.
Phosphohydrolase Activity of NBP46

A search of protein and nucleotide data bases using the NCBI BLASTP and
BLASTN programs (Altschul, S.F., et al., J. Mol. Biol. 215:403 (1990)) showed no
significant similarities of NBP46 to the amino acid or cDNA sequences of any other
plant or animal lectin yet described. It did, however, show 65.6 and 47.6% amino acid
identity and 70.7 and 58.7% nucleotide identity with the sequences of a pea nucleotide
triphosphatase (Hsieh, H.-L., et al., Plant Mol. Biol. 30:135 (1996), GenBank Accession
No. Z32743) and an apyrase isolated from potato tubers (Handa, M., et al., Biochem.
Biophys. Res. Comm. 218:916 (1996)), respectively. Thus, the pea triphosphatase gene
could also be used in the methods of the invention. Considerably less, but significant,
similarity was found with the sequences of several other animal and yeast
phosphohydrolases. Of particular interest in this comparison was the presence in all of
these sequences of four motifs (designated by the bold letters in SEQ ID NO:2) identified
as conserved regions among a variety of plant and animal apyrases (Handa, M., et al.,
Biochem. Biophys. Res. Comm. 218:916 (1996)).

The sequence similarities found between NBP46 and the above enzymes
prompted the testing of NBP46 for phosphohydrolase activity. The reactions were
conducted in 300 µL of 60 mM MOPS, pH 6.8, containing 1 mM MgCl2 in a microtiter
plate using a multichannel pipette. At various time points up to 4 minutes, 30 or 60 µL
aliquots were removed and assayed for inorganic phosphate by a photometric microtiter
assay (Drueckes, P., et al., Anal. Biochem. 230:173 (1995)). Conditions were chosen so
that less than 10% of the total substrate was converted to product, and the initial velocity
(v) was determined from the above rate measurements. The Km of NBP46 for Mg-ADP
was found to be 615 µM.

NBP46 catalyzed the hydrolysis of phosphate from both ATP and ADP
(Figure 2) but showed no activity with AMP, pyrophosphate or glucose-6-phosphate. It
also had a broad specificity for nucleotide triphosphates, including GTP, CTP and UTP.
This substrate specificity has been found to be characteristic of the apyrase category of
phosphohydrolases (EC 3.6.1.5). Preincubation of NBP46 with 10 µg/mL of HBG A + H
(which results in 46% inhibition of carbohydrate binding activity) resulted in an increase in
the Vmax of NBP46. No increase in phosphatase activity was observed upon preincubation
of NBP46 with human blood group H substance at a concentration that shows no inhibition
in the carbohydrate binding assay described above (Figure 2). The Vmax of NBP46 was
also increased in the presence of low concentrations (1 to 5 micromolar) of Nod factors,
with lower concentrations required for the Nod factors produced by rhizobia that nodulate
the plant than for the R. meliloti Nod factor. These results suggest that there is interaction
between the carbohydrate binding and phosphatase sites of NBP46.
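The kinetic description above can be made concrete with the Michaelis-Menten rate law, v = Vmax[S]/(Km + [S]). The Python sketch below is only an illustration and rests on two assumptions: that the constant reported for Mg-ADP is a Michaelis constant of about 615 micromolar, and that the enzyme follows simple Michaelis-Menten kinetics. The function names and the example concentrations are hypothetical; no Vmax value is taken from the text.

```python
# Assumed Michaelis constant for Mg-ADP, in micromolar (see the lead-in above).
KM_MG_ADP_UM = 615.0


def michaelis_menten(substrate_um, vmax, km_um=KM_MG_ADP_UM):
    """Initial velocity under the Michaelis-Menten rate law (same units as vmax)."""
    return vmax * substrate_um / (km_um + substrate_um)


def fraction_of_vmax(substrate_um, km_um=KM_MG_ADP_UM):
    """Fraction of the maximal velocity reached at a given substrate concentration."""
    return substrate_um / (km_um + substrate_um)


def within_initial_rate(product_um, total_substrate_um, limit=0.10):
    """True if less than `limit` of the substrate has been converted to product,
    mirroring the 10% criterion used for initial-velocity measurements."""
    return product_um / total_substrate_um < limit


if __name__ == "__main__":
    # Hypothetical substrate concentrations, in micromolar.
    for s in (100.0, 615.0, 5000.0):
        print(s, round(fraction_of_vmax(s), 3))
```

At a substrate concentration equal to Km the velocity is half of Vmax, which is why an increase in Vmax at fixed Km raises the rate proportionally at every substrate concentration.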
Isolation and Characterization of NBP46 cDNA and Encoded Protein

Two consensus N-glycosylation sites are present in the sequence of the
mature protein at residues 111 and 276. Work in progress in our laboratory has established
that NBP46 is indeed glycosylated at at least one of these sites. It should be noted,
however, that we do not yet know whether other posttranslational modifications of this
protein may occur, such as the COOH-terminal proteolysis that modifies two other lectins
from this plant (Etzler, M.E., Biochemistry 33:9778-9783 (1994); Schnell, D.T. et al., Arch.
Biochem. Biophys. 310:229-235 (1994)). A search of protein and nucleotide data bases
using the NCBI TBLASTN and BLASTN programs (Altschul, S.F. et al., J. Mol. Biol.
215:403-410 (1990)) showed no significant similarities of NBP46 to the amino acid or
cDNA sequences of any other plant or animal lectin yet described. It did, however, show
65.6 and 47.6% amino acid identity and 70.7 and 58.7% nucleotide identity with the
sequences of a pea nucleotide triphosphatase (Hsieh, H-L. et al., Plant Mol. Biol.
30:135-147 (1996)) and an apyrase isolated from potato tubers (Handa, M. and Guidotti,
G., Biochem. Biophys. Res. Comm. 218:916-923 (1996)), respectively. Considerably less,
but significant, similarity was also found with the sequences of several other animal and
yeast phosphohydrolases. Of particular interest in this comparison is the presence in all of
these sequences of four motifs (designated by the boxes in SEQ ID NO:2) identified as
conserved regions among a variety of plant and animal apyrases (Handa, M. and Guidotti,
G., Biochem. Biophys. Res. Comm. 218:916-923 (1996)).

The sequence similarities found among NBP46 and the above enzymes
prompted us to test NBP46 for phosphohydrolase activity. NBP46 catalyzes the hydrolysis
of phosphate from both ATP and ADP but showed no activity with AMP, pyrophosphate
or glucose-6-phosphate. The Km of NBP46 for Mg2+-ADP is 615 nM. The lectin has a
broad specificity for nucleotide triphosphates, including GTP, CTP and UTP (data not
shown). This substrate specificity is characteristic of the apyrase category of
phosphohydrolases (EC 3.6.1.5). Preincubation of NBP46 with ligands that are recognized
by its carbohydrate binding site results in an increase in the Vmax of this enzyme. Low
micromolar concentrations of the above Nod factors stimulate this increase in activity, with
lower concentrations required for the Nod factors produced by rhizobia that nodulate the
plant than for the R. meliloti Nod factor (Figure 2). Such an increase in enzyme activity is
also obtained with low millimolar concentrations of the chitin oligosaccharides and
N-acetylglucosamine, but not with N-acetylgalactosamine (data not shown). These results
suggest that there is interaction between the carbohydrate binding and phosphatase sites of
NBP46. Whether this interaction represents a direct stimulation of the enzyme activity or
perhaps a stabilization of the enzyme under the assay conditions remains to be determined.
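
To illustrate how a constant of this kind relates substrate concentration to reaction rate, the sketch below evaluates the standard Michaelis-Menten expression. It assumes that the 615 nM figure quoted above for Mg-ADP is a Michaelis constant and that NBP46 follows simple Michaelis-Menten kinetics; neither assumption is confirmed by the text, and the rate is expressed as a fraction of an arbitrary maximal velocity.

```python
def mm_rate(substrate_nm: float, vmax: float, km_nm: float) -> float:
    """Michaelis-Menten initial velocity: v = Vmax * [S] / (Km + [S])."""
    if substrate_nm < 0 or km_nm <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate_nm / (km_nm + substrate_nm)


KM_NM = 615.0  # assumed to be the Km for Mg-ADP quoted in the text

# Fraction of maximal velocity at 0.1x, 1x and 10x the assumed Km.
for s in (61.5, 615.0, 6150.0):
    print(f"[S] = {s:7.1f} nM -> v/Vmax = {mm_rate(s, 1.0, KM_NM):.3f}")
```

Under this model, a ligand that raises the maximal velocity scales every value by the same factor without changing the substrate concentration that gives half-maximal rate.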
NBP46 binds to chitin and other carbohydrates

NBP46 also binds to chitin, a polymer of β(1-4) linked N-acetyl-D-glucosamine
residues; this binding is saturable with a Bmax of 28 nmoles of NBP46/gram of chitin and a
Kd of 48 nM. Using chitin as a solid phase, a competitive binding assay was utilized to
examine the carbohydrate specificity of this protein (Figure 3). Inhibition of binding was
obtained with high concentrations of N-acetyl-D-glucosamine but not with similar
concentrations of N-acetyl-D-galactosamine, the C4 epimer of this sugar, nor with other
common monosaccharides. The chitin disaccharide gave approximately ten-fold better
inhibition than the monosaccharide, whereas the chitin penta- and hexasaccharides were
slightly better inhibitors than the disaccharide. No inhibition was obtained with the
de-N-acetylated chitin oligosaccharides; however, when tested in the millimolar range of
concentrations, several of these oligosaccharides precipitated the lectin even under highly
buffered conditions. Whether this precipitation is specific or nonspecific is under
investigation.

Of all the oligosaccharides tested, the best inhibition was obtained with the
Nod factor isolated from Bradyrhizobium japonicum USDA110 (Figure 3), a rhizobial
strain that nodulates Dolichos biflorus. The chitolipo-saccharidic Nod factors have been
identified as the signals produced by rhizobia that initiate the nodulation of legumes
(Denarie, J. et al. Annu. Rev. Biochem. 65:503-535 (1996)). The B. japonicum USDA110
Nod factor consists of a chitin pentasaccharide backbone, modified by a 2-O-methyl
α-L-fucose on C-6 of the sugar at the reducing end and the substitution of the acetyl group on
the sugar at the nonreducing end with cis-vaccenic acid (Sanjuan, J. et al. Proc. Natl. Acad.
Sci. USA 89:8789-8793 (1992); Carlson, R.W. et al. J. Biol. Chem. 268:18372-18381
(1993)). The higher relative affinity of NBP46 for the intact Nod factor than for the chitin
pentasaccharide backbone alone indicates that the modifications of this backbone
contribute to the recognition of the Nod factor by the lectin. No significant inhibition of
NBP46 binding to chitin was obtained with cis-vaccenic acid when tested at concentrations
up to 1.2 mM nor with L-fucose at concentrations up to 50 mM.
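
The saturable binding to chitin and its inhibition by soluble sugars can be pictured with a one-site binding model. The sketch below is illustrative only: it assumes that the 28 nmoles/gram and 48 nM figures quoted above are a binding capacity and a dissociation constant, and that a soluble inhibitor competes for the same site. The inhibitor concentration and inhibition constant in the example are hypothetical; the text reports only relative inhibition, not inhibition constants.

```python
def bound_nmol_per_g(free_nm: float,
                     bmax: float = 28.0,
                     kd_nm: float = 48.0,
                     inhibitor: float = 0.0,
                     ki: float = float("inf")) -> float:
    """One-site saturable binding with an optional competitive inhibitor.

    bound = Bmax * [L] / (Kd_app + [L]),  Kd_app = Kd * (1 + [I] / Ki)
    inhibitor and ki must share the same (arbitrary) concentration unit.
    """
    if free_nm < 0 or kd_nm <= 0 or ki <= 0:
        raise ValueError("concentrations must be non-negative, constants positive")
    apparent_kd = kd_nm * (1.0 + inhibitor / ki)
    return bmax * free_nm / (apparent_kd + free_nm)


# Half-saturation is reached when free NBP46 equals the assumed Kd.
print(bound_nmol_per_g(48.0))
# A hypothetical inhibitor present at its own Ki doubles the apparent Kd.
print(round(bound_nmol_per_g(48.0, inhibitor=100.0, ki=100.0), 2))
```

In such a model a better inhibitor is simply one with a lower inhibition constant, which is one way to read the ranking of sugars and Nod factors described above.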

Two Nod factors from Rhizobium sp. NGR234, another strain that nodulates
Dolichos biflorus, were also able to inhibit the binding of NBP46 to chitin. These Nod
factors differ from the USDA110 Nod factor in that they have a sulfate on C-3
(NodNGR^) or an acetate on C-4 (NodNGRe) of the 2-O-methylfucose; they are also
methylated on the amino group and partially carbamoylated at C-3, C-4 or C-6 of the sugar
at the nonreducing end (Price, N.P.J. et al. Carbohyd. Res. 289:115-136 (1993)). The Nod
factor from Rhizobium meliloti, a strain that does not nodulate Dolichos biflorus, gave the
weakest inhibition when tested at equivalent concentrations (Figure 3). This Nod factor
differs from the USDA110 Nod factor in that it has a chitin tetrasaccharide backbone,
contains a sulfate instead of a fucose at the reducing end and is acetylated at C-6 of the
sugar at the nonreducing end (Lerouge, P. Nature 344:781-784 (1990)).

Although the differences in relative affinity of NBP46 for the above Nod
factors indicate a small preference of the lectin for Nod factors produced by rhizobia that
nodulate the plant, it must be pointed out that both the B. japonicum USDA110 and R. sp.
NGR234 strains are only weak nodulators of Dolichos biflorus, and the nodules formed
with the former strain do not fix nitrogen. Nod factors from rhizobial strains that are
strong nodulators of this plant have not yet been purified or characterized.

Antiserum raised against NBP46 inhibits nodulation

Confocal immunofluorescence microscopy of whole mounts of 7-day old
Dolichos biflorus roots that had been fixed prior to staining showed that NBP46 is present
on the surfaces of the newly emerging and young root hairs. Treatment of young roots of
this plant with antiserum to the lectin inhibited the ability of these roots to be nodulated by
rhizobia (Table 1). Although it is possible that such inhibition could be due to steric
hindrance of adjacent sites, these results, coupled with the above finding that NBP46 is a
Nod factor binding protein, suggest that this root lectin may play a role in
rhizobium-legume symbiosis either as a host/strain specific receptor or perhaps as a
second, less stringent receptor postulated for this process (Ardourel, M. et al. Plant Cell
6:1357-1374 (1994)). Previous attempts to implicate lectins in this symbiosis have been
focused on the legume seed lectins (Diaz, C.L. et al. Nature 338:579-581 (1989); Hirsch,
A.M. et al. Symbiosis 19:155-173 (1995)), which have not been reported to bind Nod
factors. It is also possible that NBP46 may function in the recognition of endogenous
Nod-factor like signals that have been proposed to play a role in the regulation of plant
growth and organogenesis (Etzler, M.E. Biochemistry 33:9778-9783 (1994)).



Table 1. Effect of anti-NBP46-serum on nodulation of D. biflorus roots

                         Average number of nodules (± S.E.)
Treatment                Treated region of root    Region of root emerged after treatment
Untreated                3.6±0.5                   2.2±0.2
Preimmunization serum    3.4±0.5                   1.6±0.2
Anti-NBP46-serum         0.6±0.2                   1.4±0.2

The roots of 2 sets of 10 3-day old Dolichos biflorus plants were immersed for 1 hour in
1/100 dilutions of preimmunization serum or anti-NBP46-serum, washed and transferred
to growth pouches. A third set of 10 plants was put directly in growth pouches. Half of each
set of plants was inoculated with Bradyrhizobium sp. 24A10. After 3 weeks the number of
nodules in the treated region as well as in the region of root that emerged after treatment
was recorded. No nodules were observed on the roots that had not been inoculated with
rhizobia.
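
The size of the effect in Table 1 can be expressed as a percent reduction in mean nodule number relative to the preimmunization serum control. The short sketch below uses only the means reported in the table; the choice of the preimmunization serum row as the reference, and the variable and function names, are assumptions made for illustration, and the standard errors are not propagated.

```python
# Mean nodule counts from Table 1: (treated region, region emerged after treatment)
TABLE_1 = {
    "Untreated": (3.6, 2.2),
    "Preimmunization serum": (3.4, 1.6),
    "Anti-NBP46-serum": (0.6, 1.4),
}


def percent_reduction(control: float, treated: float) -> float:
    """Reduction of the treated mean relative to the control mean, in percent."""
    if control <= 0:
        raise ValueError("control mean must be positive")
    return 100.0 * (control - treated) / control


control = TABLE_1["Preimmunization serum"]
antiserum = TABLE_1["Anti-NBP46-serum"]

treated_region = percent_reduction(control[0], antiserum[0])
emerged_region = percent_reduction(control[1], antiserum[1])
print(f"treated region: {treated_region:.1f}% fewer nodules")
print(f"emerged region: {emerged_region:.1f}% fewer nodules")
```

The much larger reduction in the treated region than in the region that emerged after treatment is the contrast on which the inhibition argument rests.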
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DISCUSSION 

The low concentrations (10'*^ of Nod factor that have been found to induce 
physiological responses in legumes (Denarie, J. et al. Annu. Rev. Biochem. 65:503-535
(1996)) predict that Nod factor receptors have high affinity for their ligands. Indeed, high
affinity binding sites for Nod factors have been found on particulate fractions from roots of
the legume Medicago truncatula (Niebel, A. Mol. Plant-Microbe Interact. 10:132-134
(1997)). Although the inhibition data show the relative affinities of NBP46 for its ligands,
they do not enable the determination of the absolute affinities of this lectin for the Nod
factors. The concentrations of Nod factors required for the stimulation of increased
phosphatase activity suggest that the Kd's may be in the high nanomolar to low micromolar
range. It should be noted, however, that NBP46 is primarily a monomer in solution
(Quinn, J.M. and Etzler, M.E. Arch. Biochem. Biophys. 258:535-544 (1987)); as
established with antibodies (Hornick, C.L. and Karush, F. Immunochem. 9:325-340
(1972)), the multivalence that would occur when this lectin is associated with the cell
surface would increase its apparent affinity for multivalent ligands such as Nod factor
micelles or Nod factor on the surface of rhizobia by several orders of magnitude.

The presence of both carbohydrate binding activity and apyrase activity on 
NBP46 and the apparent interaction of these sites suggest that, upon binding its 
carbohydrate ligand, NBP46 may play a role in activating downstream events either 
directly by signal transduction or indirectly, perhaps by serving as a motor for transport of 
the carbohydrate. In this context, it is of interest that the human CD39 lymphoid cell 
activation antigen, one of the apyrases found to have some sequence similarity to NBP46, 
is thought to be involved in the regulation of B cell adhesion (Kansas, G.S. et al. J.
Immunol. 146:2235-2244 (1991)). Although these other apyrases have not been tested for
lectin activity, it is possible that such dual activities of these proteins may have been 
conserved throughout evolution. 

The unique amino acid sequence, carbohydrate specificity and apyrase
activity of NBP46 distinguish this lectin from the conventional lectins found in abundance
in the seeds of legumes (Sharon, N. and Lis, H. FASEB J. 4:3198-3208 (1990)). The
possibility that other such plant lectin/enzymes exist is suggested by the recent finding of a
cDNA from Arabidopsis thaliana that encodes a receptor-like serine/threonine kinase as
well as a legume seed lectin-like domain (Herve, C. et al. J. Mol. Biol. 258:778-788
(1996)). NBP46 may thus be one of many multifunctional carbohydrate binding
proteins that may function in plant oligosaccharide signaling events. A variety of
transgenic experiments are underway to elucidate its role in such processes.

METHODS

Preparation of NBP46. NBP46 was extracted from the roots of 7-day old Dolichos
biflorus plants and isolated by affinity chromatography on hog blood group A + H
-Sepharose as previously described (Quinn, J.M. and Etzler, M.E. Arch. Biochem.
Biophys. 258:535-544 (1987)), followed by ion exchange chromatography. It was
iodinated using the iodine monochloride procedure as previously described (Etzler, M.E.
Glycoconj. J. 11:395-399 (1994)), which gave a specific activity of approximately 500 x
10^ cpm/mg protein. 

Carbohydrate binding assays. Solid phase binding assays were conducted using
iodinated NBP46 and purified shrimp chitin powder (Sigma Chemical Company, St.
Louis, MO), which was N-acetylated prior to use with 15 mM acetic anhydride in 0.5 M
NaHCO3 for one hour at room temperature. The assays were conducted in a final volume
of 100 µl of 10 mM MOPS buffer, pH 7.2, containing 0.02% Tween-20 and 0.01% NaN3.
After incubation at room temperature for two hours, binding was measured as previously
described (Etzler, M.E. Glycoconj. J. 11:395-399 (1994)).

Bradyrhizobium japonicum USDA110 Nod factor was isolated as
previously described (Sanjuan, J. et al. Proc. Natl. Acad. Sci. USA 89:8789-8793 (1992)).
The Nod factors from Rhizobium meliloti and Rhizobium sp. NGR234 were graciously
provided by Dr. Jean Denarie, CNRS-INRA, Toulouse, France. Monosaccharides and the
chitin disaccharide were purchased from Sigma Chemical Co., St. Louis, MO; the other
chitin oligosaccharides were obtained from Seikagaku Corp., Tokyo, Japan.

Cloning of NBP46 cDNA. Total RNA was isolated (Taylor, B. and Powell, A. Focus
4:4-6 (1982)) from the roots of 1-day-old D. biflorus plants and reverse transcribed using
M-MLV reverse transcriptase and random hexanucleotide primers (Tabor, S.
RNA-dependent DNA polymerases. In Current Protocols in Molecular Biology, F.M.
Ausubel et al., Eds., John Wiley & Sons, Inc., Vol. 1, pp. 3.7.1-3.7.3 (1987)). This
cDNA was used as a template in a PCR reaction employing Taq polymerase and
degenerate sense and antisense primers corresponding to amino acids 6-12 and 244-249 in
SEQ ID NO:2. The PCR was performed in an automated thermal cycler for 35 cycles of
94°C for 2 min, 37°C for 2 min, and 72°C for 2 min. The predominant 727 bp fragment
was isolated on a 1.2% agarose gel, cloned into the pCRII vector (Invitrogen) and
sequenced (Sanger, F. et al. Proc. Natl. Acad. Sci. USA 74:5463-5467 (1977)). Gene
specific primers were used in 5' and 3' RACE reactions (Frohman, M.A. Proc. Natl. Acad.
Sci. USA 85:8998-9002 (1988)); the products were cloned into the pCRII vector and
sequenced. The full length (1527 bp) cDNA was assembled by ligating the two RACE
products together using an internal SacI site. The sequences of the overlapping regions of the
5' and 3' RACE products and the original PCR fragment were identical.

Phosphatase assays. NBP46 (201 ng/ml) was incubated at 25°C in the presence of
various concentrations of substrate in a final volume of 100 µl of 60 mM MOPS, pH 6.8,
containing 1 mM MgCl2. The reactions were conducted in a microtiter plate using a
multichannel pipette. At various time points, 30 µl aliquots were removed and assayed for
inorganic phosphate by a photometric microtiter assay (Drueckes, P. et al. Anal. Biochem.
230:173 (1995)), modified by using four parts ammonium molybdate reagent to one part
10% ascorbate for the reagent mixture. Conditions were chosen so that less than 10% of
the total substrate was converted to product.
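
The requirement that less than 10% of the substrate be converted is the usual way of keeping measurements in the initial-rate regime. The sketch below shows one way such a criterion could be applied when estimating a rate from time-point data: points at or above the conversion limit are discarded and a least-squares slope is fitted to the rest. The function name, the fitting approach and the example numbers are assumptions for illustration and are not described in the text.

```python
def initial_rate(times_min, phosphate_nmol, substrate_nmol, max_fraction=0.10):
    """Least-squares slope (nmol/min) using only points below max_fraction conversion."""
    if substrate_nmol <= 0:
        raise ValueError("substrate amount must be positive")
    points = [(t, p) for t, p in zip(times_min, phosphate_nmol)
              if p / substrate_nmol < max_fraction]
    if len(points) < 2:
        raise ValueError("need at least two points in the initial-rate regime")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_p = sum(p for _, p in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in points)
    return sxy / sxx


# Invented data: the last point (15% conversion) is excluded from the fit.
rate = initial_rate([0, 5, 10, 15, 20], [0.0, 1.0, 2.0, 3.0, 15.0], substrate_nmol=100.0)
print(rate)
```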

Immunofluorescence microscopy. Roots from 7-day old Dolichos biflorus plants were
fixed for 45 minutes at 4°C in 0.01 M phosphate buffer, pH 7.2, containing 0.15 M NaCl
and 0.3% paraformaldehyde. After washing, the roots were treated for 20 minutes with a
1/250 dilution of preimmunization serum or antiserum prepared against recombinant
NBP46. After washing, the roots were treated for 20 minutes with fluorescein-labeled
goat anti-rabbit IgG (Sigma Chemical Co., St. Louis, MO), washed and examined with a
Leica TCS NT confocal microscope using a 488 nm laser excitation line and a 560 barrier
filter. Confocal images were reconstructed with Imagespace software.

Nodulation. Dolichos biflorus seeds were sterilized by shaking for 15 minutes in 70%
ethanol, followed by 15 minutes in 3% hydrogen peroxide. After extensive washing with
sterile H2O, the seeds were germinated and grown in sterile growth pouches. At 3 days,
30 the roots were inoculated with 100 of B, sp. 24A10 (1 x 10^ cells/ml). The number of 
nodules per root was determined after 3 weeks. Antiserum and preimmunization serum
used to treat the roots were sterilized by filtration through a 0.45 µm filter.

Example 2: Isolation of NBP46 from other species

NBP46 nucleic acids have also been isolated from Medicago sativa (SEQ
ID NO:8 and 9) and Lotus japonicus (SEQ ID NO:10 and 11). These nucleic acids were
obtained by RT-PCR as follows. Messenger RNA was obtained from the roots of both
species and reverse transcribed using oligo-dT primers. Degenerate PCR primers were
designed to conserved sequences of the D. biflorus NBP46 disclosed here and the Pisum
sativum nucleotide triphosphatase gene described by Hsieh, H.-L., et al., Plant Mol. Biol.
30:135 (1996). These were used to generate internal 850 bp fragments from both Medicago
sativa and Lotus japonicus cDNA. Species-specific primers were then designed for both 5' and
3' RACE. Full length clones were obtained using primers designed to the 5' and 3' ends of
the RACE products. Duplicate clones from each species were obtained in separate PCR
reactions and sequenced in their entirety in both directions.

Example 3: Isolation of DBX from D. biflorus

A second gene also involved in oligosaccharide signaling has been isolated
from D. biflorus (SEQ ID NO:12 and 13).

It is understood that the examples and embodiments described herein are for 
illustrative purposes only and that various modifications or changes in light thereof will be 
suggested to persons skilled in the art and are to be included within the spirit and purview 
of this application and scope of the appended claims. All publications, patents, and patent 
applications cited herein are hereby incorporated by reference for all purposes.

WHAT IS CLAIMED IS:

1. An isolated nucleic acid molecule comprising a NBP46
polynucleotide sequence, which polynucleotide sequence specifically hybridizes to SEQ
ID NO:1 under stringent conditions.

2. The isolated nucleic acid molecule of claim 1, wherein the NBP46
polynucleotide is between about 100 nucleotides and about 1600 nucleotides in length.

3. The isolated nucleic acid molecule of claim 1, wherein the NBP46
polynucleotide is SEQ ID NO:1.

4. The isolated nucleic acid molecule of claim 1, further comprising a
plant promoter operably linked to the NBP46 polynucleotide.

5. The isolated nucleic acid molecule of claim 4, wherein the plant
promoter is a root specific promoter.

6. The isolated nucleic acid molecule of claim 1, wherein the NBP46
polynucleotide encodes a NBP46 polypeptide of between about 50 and about 460 amino
acids.

7. The isolated nucleic acid molecule of claim 6, wherein the NBP46
polypeptide has an amino acid sequence as shown in SEQ ID NO:2.

8. An isolated nucleic acid molecule comprising a NBP46
polynucleotide sequence, which polynucleotide sequence encodes a NBP46 polypeptide of
between about 50 and about 210 amino acids.

9. The isolated nucleic acid molecule of claim 8, wherein the NBP46
polypeptide has an amino acid sequence as shown in SEQ ID NO:2.

10. A transgenic plant comprising an expression cassette containing a
plant promoter operably linked to a heterologous NBP46 polynucleotide that specifically
hybridizes to SEQ ID NO:1 under stringent conditions.

11. The transgenic plant of claim 10, wherein the plant promoter is from
a NBP46 gene.

12. The transgenic plant of claim 11, wherein the NBP46 gene is as
shown in SEQ ID NO:3.

13. The transgenic plant of claim 10, wherein the heterologous NBP46
polynucleotide encodes a NBP46 polypeptide.

14. The transgenic plant of claim 13, wherein the NBP46 polypeptide is
SEQ ID NO:2.

15. The transgenic plant of claim 10, which is not a legume.

16. A method of modulating rhizobial interaction in a plant, the method
comprising introducing into the plant an expression cassette containing a plant promoter
operably linked to a heterologous NBP46 polynucleotide that specifically hybridizes to
SEQ ID NO:1 under stringent conditions.

17. The method of claim 16, wherein the heterologous NBP46
polynucleotide is SEQ ID NO:1.

18. The method of claim 16, wherein the plant promoter is from a
NBP46 gene.

19. The method of claim 16, wherein the heterologous NBP46
polynucleotide encodes a NBP46 polypeptide.

20. The method of claim 19, wherein the NBP46 polypeptide has an
amino acid sequence as shown in SEQ ID NO:2.

21. The method of claim 16, wherein the plant is not a legume.

22. The method of claim 16, wherein the expression cassette is
introduced into the plant through a sexual cross.

23. A method of modulating phosphohydrolase activity in a plant, the
method comprising introducing into the plant an expression cassette containing a plant
promoter operably linked to a heterologous NBP46 polynucleotide that specifically
hybridizes to SEQ ID NO:1 under stringent conditions.

24. The method of claim 23, wherein the heterologous NBP46
polynucleotide is SEQ ID NO:1.

25. The method of claim 23, wherein the plant promoter is from a
NBP46 gene.

26. The method of claim 23, wherein the heterologous NBP46
polynucleotide encodes a NBP46 polypeptide.

27. The method of claim 26, wherein the NBP46 polypeptide has an
amino acid sequence as shown in SEQ ID NO:2.

28. The method of claim 23, wherein the plant is not a legume.

29. The method of claim 23, wherein the expression cassette is
introduced into the plant through a sexual cross.

SEQ ID NO:1 Complete cDNA sequence of DB46
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SEQ ID NO: 3 

Genomic DNA sequence of Dolichos biflorus NBP46 
(Bold/ underlined segments indicate exons) 

CTAGATGTGA AGTGATTTTA ATCTTGCAAC TGGTGTAAAT AAATCATAAT ACAATATCTT 60 

ATCTTAAAAA TAAAATCTTC ATAAAAAATA AATATAATGA TTAAATTATC ATAAATAAAT 120 

AAGTAATTAT TTCCTTACCT AACATGATGG CCAGCTCATA TAATAACATC GCTTCTTGGA 180 

GCATATCAAT GACGAAAACG TGGACGCAAA TTATTGGCCT CGGGGATCTG CTTTCTGCAA 240 

ATACTTGTTT CTCCCGAGAA CCGGATTCTC ATTAATTTCT AGTTGTTCTC GTAAATTGCT 300 

CACTTTATTT TCATTGTAAA GTAAAAATAA TTTTCTACTA AAAACGATAT TCACCATGTT 360 

AGTCACATAC ACATTCAATA ATATTTAAAA TGTTATTTAT TTAATGGGAA GAAGATTTAA 420 

TAATTGGGGT TAGTTCTTAC AATAATACAT ACTCAACAAA ATTTTTCCTC AAATATCACA 480 

CGATAGTAAT ATATTAATCT AATATAATCT CACAAAATCA TCTCCATATT TATATATTTC 540 

ATATAGATGA TGTTATCATG GACGTGGATC TCTGCGACCA TAGCATTTTA CATCTATATA 600 

TAGTGGCAAG AGTGACGATT AGTGCAAACT G AAACGAGTA CTCTTTCAGT GGTGAGGTTC 660 

TGAGAGATTC AGAAATGAAT TQGGTGTGGC CAAAGACAAA GAGCATGAGC TTCCTACTCC 720 

TCATCACTTT TCTACTCTTC TCATTGCCAA AACTTTCTTC TTCGCAATAT GTTGGGAACA 780 

GTATCTTACT AAATCATCGT AAGATACTTC CCAACCAGGA ACTCCTTACC TCTTACGCTG 840 

TCATCTTTGA TGCTGGTAGC TCTGGGAGTC GTGTCCATGT CTTCAATTTT GACCAGAACT 900 

TAGATCTCCT GCACATTGGC AATGACCTCG AQTTTACAAA AAAGG TCAAA CTGAAACCTT 960 

AAATTATTCA TTATTATTTT CTTCATCTTA CTCTTACATT CTTCTTCATT ATTCTGGTGC 1020 

A GATCAAACC CGGTTTGAGC TCATACGCTG ATAAGCCTGA AAAAGCTGCA GAATCTCTCA 1080 

TTCCACTTTT GGAGGAAGCT GAAGATGTTG TCCCTGAGGA ACTGCACCCC AAGACACCCC 1140 

TTAAGCTTGG GG TGAGTATT TCTCATCTCT ACTTTTGCCA CAGATTAATA TGTCACACTT 1200 

TTACATGAAA CAT6ATTAAG TTCTTTAAAC ATGTTGATTA AAGGGTGACA GTTTGTATTT 1260 

TTTAATCAAG TAATCTAGAA CTTAAACTAT GGTAATAATA TAAAATGAAT ATGAAACTAA 1320 

TATATTCTGA TGGAACAGAA GAAAGCAATA TCAAGAGAGA CAAAACACAC ACTTTGATGA 1380 

GCTCTATCTT TTAAACAAAA AATGGAATTG AAAGACCAAA TAAAATAGGC ATTAGCCCAT 1440 

ATCATAAAAT CTTTTGTAAA ATATTAATAG AAAGTAAATG AACACTATAT ATGATGCATA 1500 

CGTAGAAAAT GTAAAAGGAT TTTTGAGATA ATATCTTTTG ATGTTGAATG TGAATGCAGG 1560 

CAACAGCAGG TTTGAGGCTC TTGGATGGGG ATGCTGCTGA AAAGATATTG CAAGCGG TAA 1620 

CCATGAGCTT AGTTCATTTC CTTATGTTAT TAACTACGCT TTCAATGTCT TAACTTTCGT 1680

ATTTTCAATA AAAAAAAAAA AAAAAGAAGT 2880

XCAGTAAACT TCATATCTGC ATTATGTTTA TTTGAATAGT AAAACACTAT AAAATATATC 2940

X xon X U A xuvivj GATAAACATG CAGAGTAGTA AAAAACTTAT TTAGAATATA GTCATTTAAT 3000

TTTTCTTATG ATATATCTTG GGAATTTTGT GTAGGTTACA GTTAACTATC TGTTAGGAAA 3060

GTTGGGAAAG AAGTTTACAA AAACTGTGGG AGTGATAGAT CTTGGAGGTG CTTCAGTTCA 3120

AATGGCTTAT GCTGTCTCAA GAAATACAGC TAAAAAT6CC CCAAAACCAC CACAAGGAGA 3180

GGATCCATAC ATGAAGAA6C TTGTACTCAA G6GAAA6AAA TATGACCTTT ATGTTCACAG 3240

GTTACTTTCT GTTATCATTC ATATAGCAAA GGAACAATTA TCATTTCAAT TTCTAAAATA 3300

TATTTATAAT CTCTAAAATC AAATAACATA AAAAAATGGT AATATAATGT TGCGTTTTGG 3360

GATTGTTTGG ATTAAAGGGT AAATTTGAAG AAGAAAAAAA ATAATAAATA AAGAAAAAGA 3420

GAAAAAAAAT AAGATTGTTT GGATTATTAG AAAGAGAAAA AGTTGAATAA TTATTTTTAT 3480

ATTTTAATAT TATTTTAATT ATTTATTATT ATGAAAATAA AATATTTATT TTTAAATTTA 3540

TATTTTATTA TTATTTTTTA ATTTTATTAT TATAAAAATA TAAATATTAT TAATAATTAT 3600

TATTTTAATT TTATTTATTA ATATAATATA ATAATAAATA AAATATTAAT ATTTTATGTT 3660

ATATTATATA ATATTTAATT ATACATATGT ATTTTTTTTC TGCAAATTTT TACCTTTTAA 3720

GCGGAGAAGA TGAAGGGCAT AAATTGTTCT CGAAATTAGT TATATTTTGT TCAATTTTAA 3780

CAAAATCATC TCAAATCAGT CTTCATAAAT AGTATTTATG TAGATCCAAA TAGAGGCTTA 3840

ACGTGGTCTA GTTGTACAAA CCTAAAAGGT GTTTCTTTTT TTCTTTAATT TGAAGAACTA 3900


GAATATTCj i J. 


J. X XLpAA.1 11^ 






21 21 21 'I "I "1*21^2 
l^wvvrlX X XAw 




3960 


AACTTGGTTA ACTTTATAAC GAATGTCAGA AAAAATGGTA GGTATGTTAT AAATACTTCT 4020

GATATCAAAA TGGCAAAAAC TCCAGAGTCT CACTTCCAAG AATCATCACT TTTTCTCACC 4080

TTAATCTGAA ATAATGAATG CTTACTTTTT TTAAGATATT TATAGATATC TATAATCCAT 4140

TGAAGTTCAG TGTAGTGTAA ATAAATTATA ATGTAAAAAC CTATACACTG AGTACAGATC 4200

CATGTGTAGT TACTTTTTTA TGGTTTAACT GATAAATATG CATGAGTCAT GTCATGGCTA 4260


ACGTACAGGT 


CTTAATCAAC 


TTCTTTGTTG 


CAGTTACTTG 


C6TTATGGTA 


A rT" 21 r'n^' 21 fir* 


ffc J Z U 


ACGTGTTAAG 


ATTTTTAAGA 


CCACTGATGG 


X X X X 


CCTTGCCTAT 


Tn*^ r* 21 r*r'r"r 21 

X tTivV* AUijL* J. A 


'i J OU 


TGAAGGXAAA 


TAAAGTATTC 


TTTTGTACAA 


AL. C XiiA X X 


XA^X X X^X X/i 


X X XvsCAX 1 




CAGAATAGTG 


CAAAGGACTG AAACTAGAAA 


fV^ TV TT*/^/^ A TV T 

LtLsAX iv-uAAl 


TT* A r«T A r* A 71 
X UAu X AL-MAva 


21 71 r* 21 21 21 21 21 21 21 




AGTAGTGATT TAGTGACCAA AGTTACTTTT TCCTCACTGA GTTCTATTGA AATGCAGAAA 4560

CTTGTTGCAG ATATTTTAAA TACATATTAA GTGTTTTGTC AGTACTGCAT TTGTTTTTAG 4620

TCGAGTTTTT TCTTGAAGCA TTAAAGCTGC AAATAACATG TGGGTCTTTT 4680

TTCTATCTTT AAAGATATAT ACAGATATTC CGGAGAATCG TACAATATCT ATGGTCCCAC 4740

TTCTGGTGCC AACTTTAATG AGTGCCGTGA CCTA6CTCTT CAQATTCTCA 6ATT6AATGA 4800

GCCATGTTCC CATGAAAACT GCACCTTTGG TGGGATATGG 6AT6GTGGAA AAGGAAGT66 4860

ACA6AAAAAC CTT6TTGTTA CTTCAGCTTT CTACTATAGG TCTTCTGAGG TATCCATTCT 4920

CTGTTAATTT CTTGTTTACT TTGATTACTT ATTTGTTTTT ATACCAATAA ATTTTACATT 4980

ATAGTTTATA CTGTGCTAAT TTTGTTGTTT TTAGGTTGGT TTTGTCACTC CTCCCAATTC 5040

CAAAAATCGC CCTCTGGATT TTGAAACTGC AGCTAAACAA GCTTGTAGTT TAACATTCGA 5100

GGAA6CGAAA TCCACTTTTC CAAATGTTGA GAAAGATAAA CTTCCATTTG TATGCGTGGA 5160


TTTCACATJLC 


CAGTATACAT 


X w W X X \3 X X UXL 


XUU;aX X X vnj X 


4-VXVjX X X XV—aX 


a aTfaa'PTaf^ 




CAAGTTGATA 


TTTAACTTCT 


TCCAAAAAAC 


XnXVjX X X XV^X 


X X XVJtXWXXv^V^ 


a a r* a r*TG a r*T^ 




CCTAATTCAA 


CTTTTGGCAG 


6CCTAGATCC 


AGAGCAAGAG 


ATTACAGrGG 


CAGAAGG7LAT 


5340 


TGAATATCAA 


GAT6CCATT6 


TGQAAACAGC 


AX X ^ Xn, 


GGAAffTGC'CA 

OwHaW X VJWV« A 


TaGaaGPfAT 

X A^JfcCfctfWV. aX 


Z>*±\J \f 


ATCATC7TTT(t 


CCTAAATTTA 


ATCGTCTAAT 


\3XaX X X XAX^ 


Ta a GPr*aTGT 

X AAU\<r%MAX W X 


f PT f a a 

WW X W WaW X X A 






AATTAAAATA 


AAAnTPAPCTf 


X X X X LAW X Au 


TCCTTTTTTA 


X XWWAX XwAv 






XaMX X Xl3X X X 


a r» a 31 a Tfi 

^ X \7 A^^Ann X u 


iMilnv X X AAA 


ir«*pr»aiv ar^a a 

Aur X VarAAAUAA 


AGTATGTTTT 


c c Q rt 




\uUi.X WMinUX 




X iaAuU X (jC (. A 


ATTAACTAAT 


ACTCTGACTT 










TV TV TV TV "TV T'/^IVPTV 

AAAAATCAxA 


TATGTAATCG 


GGAAAATTTG 


5700 




T*T\ TV TV A A T\ T\ TV^ 


Ar^OT\?VTVT'T\7\ 7\ 

AL. CJiAAl AAA 


TTTTCCTAAA 


TTCCTCTGCA 


ACATATACAA 


5760 






TTTCTTTAAT 


GGAATAAGTA 


CTTTTTGAAA 


AACTATCATA 


5820 


TTAGTAAACT TATcrrriTC ATCTAACAGG CAGCAAAATT AATTGCATGA ACGGATCCAA 5880

TTAATTCTCT CGTACAGCTC CAGATAAGAA GCGTTTAATG AGATAAATTG TTGGATAATA 5940

TATGTTGGGT GTGGGTGGAT TATGATACTA TCGATAATAA ATTTGGAATC TAATTAAATT 6000

TTATAAAATT AATTTATCAA TATATAATAT TTTATATATA TTAATTTGAT AATATTTTTA 6060

ATAATTTTAT ATTTTTAATA TTTAATTTTA ATTTAAGQAA ATTTTTAAGA TAATTAATTT 6120

TTTATTTTTA TTTTTTTGTA TAGTACTCAG GACATAATAA TGTTATTAAT TTAAATAAGA 6180

CTTAAATATA TATTTTTCTT ATAATGCTTA AATCTCAGTC TTATTATTGC TATCACATAA 6240

TGACACGAAC TAACTAGCTT CACTC 6300
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SEQ ID NO:4 5' Degenerate primer sequence 
TA(T/C)GCNGTISrAT(T/C)TT(T/C)GATCG 

SEQ ID NO:5 3' Degenerate primer sequence 

AT(A/G)TT(A/G)TA(T/A/G)AT(G/A)CCNGG 

SEQ ID NO:6 3' RACE primer 
CGTCCGATACTTCTATA 

SEQ ID NO:7 5' RACE primer 
AACTTAGATCTCCTGCAC 
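
The degenerate primers listed above stand for mixtures of oligonucleotides. As a rough illustration, the sketch below counts how many distinct sequences a primer written in this parenthesized notation represents, using the 3' degenerate primer (SEQ ID NO:5) as input. The parser and its name are hypothetical helpers written for this example; it assumes that each parenthesized group lists alternative bases separated by slashes and that N stands for any of the four bases.

```python
import re

# Number of bases represented by each unparenthesized symbol.
SYMBOL_CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "N": 4}


def degeneracy(primer: str) -> int:
    """Count the distinct sequences encoded by a primer such as AT(A/G)TT...CCNGG."""
    total = 1
    for group, base in re.findall(r"\(([^)]*)\)|([A-Za-z])", primer):
        if group:
            # A parenthesized group such as (T/A/G) offers one choice per listed base.
            total *= len(group.split("/"))
        else:
            symbol = base.upper()
            if symbol not in SYMBOL_CHOICES:
                raise ValueError(f"unsupported symbol: {symbol}")
            total *= SYMBOL_CHOICES[symbol]
    return total


print(degeneracy("AT(A/G)TT(A/G)TA(T/A/G)AT(G/A)CCNGG"))  # prints 96
```

A lower degeneracy generally means a larger share of the primer pool matches any given target, which is one reason degenerate primers are usually placed on well conserved amino acid stretches.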






WO 99/07223



PCT/US98/16261 



SEQ ID NOS: 8-9 



11 

CAA ATT AA3 


AAC 


ATG 


20 
GAG 


TIC 


29 

CTA AIT ACA CTC 


38 
ATT GCC 


ACT 


47 

TTT 


TTA 


CTC 


56 
TTG 


Q I 


K 


IX 


M 


£ 


F 


L 


I 


T 


L 


I 


A 


T 


F 


L 


L 


L 


TTA ATG 


65 
CCT 


GCA 


ATC 


74 
ACT 


TCC TCC 


83 92 
CAA TAT TTA GGA AAC 


AAC 


101 
CTA 


CTC 


ACT 


110 
AAT 


L M 


P 


A 


I 


T 


S 


5 


Q 


Y 


I. 


G 


N 


N 


L 


L 


T 




CGA AAG 


119 

ATT 


TTC 


128 137 
CAA AAA CAA GAA ACC 


TTA ACC 


146 
TCT 


TAC 


GCT 


155 
GTC 


ATA 


TTT 


164 

GAT 


R K 


I 


F 


Q 


K 


Q 


E 


T 


L 


T 


S 


■ Y 


A 


V 


I 


p 


D 


GCT GGT 


173 
AGC 


ACT 


GGT 


182 
ACT 


CGT 


GTC 


191 

CAT GTT 


TAC 


200 
CAT 


TTT 


GAT 


209 
CAG 


AAC 


218 
TTA GAT 


A G 


S 


T 


G 


T 


R 


V 


H 


V 


Y 


-H 


F 


D 


<2 


N 


L 


D 


CEA CTT 


227 
CAC 


ATT 


GGC 


236 
AAT 


GAT 


ATT 


245 
GAG 


TTT 


GTT 


254 
GAC 


AAG 


ATC 


263 
AAA 


CCA 


GGT 


272 
TTG 


L L 


u 


I 


G 


N 


D 


I 


E 


F 


V 


D 


K 


I 


K 


p 


G 


L 


281 290 299 308 317 326 
AST GCA TAT GGG GAT AAT CXTT GAA CA^^ GCA GCA AA;;. TCT CTC ATT CCA CTT TTC 


S A 


Y 


G 


D 


N 


P 


E 


Q 


A 


A 


K 


S 


L 


I 


P 


L 


L 


GAG GAA 


335 
GCA 


GAA 


GAT 


344 

GTG 


GTT 


CCT 


353 
GAG 


GAT 


CTG 


362 
CAC 


CCC 


AAA 


371 
ACA 


CCC 


CTT 


380 
AGG 


E E 


A 


E 


D 


V 


V 


P 


E 


D 


I. 


H 


p 


K 


T 


p 


L 


R 


CT? GGG 


389 
GCA 


ACC 


GCA 


398 
GGT 


TTG AGG 


407 
CTT 


TTG 


AAT 


416 

GGG 


GAT 


GCT 


425 

GCT 


GAA 


AAG 


434 
ATA 


L G 


A 


T 


A 


G 


L 


R 


L 


L 


N 


G 


D 


A 


A 


H 


K 


I 


TTG CAA 


443 

GCG 


ACA AGG 


452 
AAT 


ATG 


TTC 


461 

AGC 


AAC 


AGA 


470 

AGT 


ACC 


CTC 


479 
AAC 


GTT 


CAA 


488 
CGT 


L Q 


A 


T 


R 


NT 


M 


F 


S 


N 


R 


S 


T 


L 


N 


V 


Q 


R 


GAT GCA 


497 
GTT 


TCT 


ATT 


506 
ATT 


GAT 


GGA 


515 
ACC 


CAA 


GAA 


524 

GGT 


TCT 


TAT 


533 

ATG 


TGG 


GTDG 


542 
ACA 


D A 


V 


S 


I 


I 


D 


C 


T 


Q 


E 


G 


S 


Y 


M 


W 


V 


T 


GTT AAC 


551 
TAT 


GTA TTG 


560 
GGG 


AAT 


TTG 


569 

GGA AAA 


AGC 


578 
TTC 


ACA 


AAA 


587 
TCA 


GTG 


GGA 


596 
GTA 


V N 


y 


V 


h 


G 


N 


L 


G 


K 


S 


V 


T 


K 


S 


V 


G 


V: 


ATT GAC 


605 

CTT 


GGA GGT 


614 
GGT 


TCA GTT 


623 

CAA ATG 


ACA 


632 

TAT GCA 


GTG 


641 
TCA 


AAG 


AAA 


650 
ACA 


I D 


L 


G 


G 


G 


S 


V 


Q 


M 


T 


Y 


A 


V 


S 


K 


K 


T 
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I 



559 668 677 685 695 
GCA AAA AAT GCT CCT AAA GTT GCT GAT GGA GAG GAT CCA TAT ATT AAG 


704 
AAG CTT 


A 


K N 


A 


P K V 


A D G E D 


P Y I 


K 


K L 


GTG 


713 
CTC AAG 


GGA 


722 
AAG CAA TAT 


731 740 749 
GAT CTC TAT GTT CAT AGT TAG TTG 


CGT 


758 
TTT GGC 


V 


L K 


G 


K Q Y 


D L Y V H 


S Y L 


' R 


F G 


767 

AAA GAA GCA ACT 


776 
CGA GCA CAG 


785 794 
GTT TTG AAT GCA ACT 


803 

AAT GGA TCT 


GCT 


812 
AAC CCT 


K 


E A 


T 


R A Q 


V L N A T 


N G S 


A 


N P 


821 830 839 848 857 866

TGC ATT TTA CCT GGA TTT AAT GGG ACC TTT ACA TAT TCA GGA GTG GAG TAT AAG

C I L P G F N G T F T Y S G V E Y K


875 884 893 902 911 920

GCT TTT TCC CCT TCT TCT GGC TCC AAC TTT GAT GAT TGC AAA aAA ATA ATT CTT

A F S P S S G S N F D D C K E I I L


929 938 947 956 965 974

AAG GTT CTT AAA GTA AAT GAT CCA TGT CCC TAT CCG AGT TGC ACT TTT GGT GGA

K V L K V N D P C P Y P S C T F G G


983 992 1001 1010 1019 1028

ATA TGG AAT GGT GGA GGA GGG AGT GGA CAA AAA AAA CTT TTT GTT ACT TCA GCT

I W N G G G G S G Q K K L F V T S A


1037 1046 1055 1064 1073 1082

TTC GCT TAG CTG GCT GAA GAT GTT GGT ATG GTT GAG CCA AAT AAA CCT AAT TCC

A Y h A E D V G M V E P N K P N S


1091 1100 1109 1118 1127 1136

ATA CTT CAT CCA GTA GAT TPC GAA ATT GAA GCT AAG CGA GCT TGT GCA TTA AAC

I L H P V D F E I E A K R A C A L N


1145 1154 1163 1172 1181 1190

TTT GAG GAT GTC AAA TCC ACT TAT CCT CGA CTT ACG GAT GCA AAA CGT CCA TAT

F E D V K S T Y P R L T D A K R P Y


1199 1208 1217 1226 1235 1244

GTA TGC ATG GAT CTC TTA TAG CAA CAT GTG TTG CTT GTT CAT GGA TTT 3GC TTA

V C M D L L Y Q H V I, L V H G F G L


1253 1262 1271 1280 1289 1298

GGT CCA CGA AAA GAG ATT ACA GTA GGT GAG GGA ATT CAA TAT CAG AAT rCT GTT

G P R K E I T V G E G I Q Y Q N S V


1307 1316 1325 1334 1343 1352

GTG GAA GCT GCA TGG CCT CTA GGT ACT GCC GTG GA^ GCC ATA TCA GCG ETA CCT

V E A A W P L G T A V E A I S A L P
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1361 1370 1379 1388 1397 1406 

AAG TTT AAG CGA TTA ATG TAT TTT ATT TAA GCT TTT AGA GAT GTC AAG ATA TTT 



K F K R L M Y F I * A F R D V K I F

1415 1424 1433 1442 1451 1460 

GAG TAA CAG CTA ACT TTA TCA AAA ATT AAA TAA AAC TGG CGC ATT TTG TCT TTC 3'

Q » Q L T L S K I K * N W R I L S F
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SEQ ID NOS: 10-11



9 18 27 36 45 54

AAG TCC TCT TCT CTC TGT AGT TAG TTG CAT TGG ACT AAA GCC ATC GAC TTC TTA

K C S S L C S L H W T K A M D F L


63 72 81 90 99 108

ATT AGT CTC ATG ACC TTT GTG TTC ATG TTA ATG CCT GCT ATC TCT TCC TCC CAA

I S L M T F V F M L M P A I S S S Q


117 126 135 144 153 162

TAT CTC GGA AAC AAC ATT CTC ATG AAT CGT AAG ATA TEA CTC CCC AAA AAT CAG

Y L G N N I L M K R K I P K N Q


171 180 189 198 207 216

GAA CCA GTT ACA TCA TAC GCT GTT ATA TTT GAT GCT GGT AGC ACT GGA AGC AGA

P V T S Y A V I F D A G S T G S R


225 234 243 252 261 270

GTC CAT GTC TAC AAT TTT GAT CAG AAC TTA GAT CTC CTT CCC GTT GAA AAC GAA

V H V Y N F D Q N D L L P V E N E


279 288 297 306 315 324

CTT GAG TTT TAT GAT TCG GTT AAA CCC GGT TIG AGT TCA TAC GCT GGT AAT CCT

L E Y S V K P G L S S Y A A N P


333 342 351 360 369 378

GAA GAA GCT GCA GAA TCT CTG ATT CCA CTT CT? AAA GAA GCA GAA AAT GTG GTT

E E A A E S L I P L L K E A E N V V


387 396 405 414 423 432

CCT GTG AGC CAG CAA CCC AAC ACA CCC GTT AAG CTT GGG GCA ACT GCA GGT TTA

P V S Q Q P N T P V K L G A T A G L


441 450 459 468 477 486

AGG CTT TTG GAG GGG AAT GCT GCT GAA AAT ATA TTG CAA GCG GTC AGG GAT ATG

R L L E G N A A E N I L Q A V R D M


495 504 513 522 531 540

CTC AGC AAC AGA AGT GCC CTT AAT GTT CAA TCA GAT GCA GTA TCT ATT CTT GAT

L S N R S A L N V Q S D A V S r L D


549 558 567 576 585 594

G3A ACC CAA GAA GGT TCT TAT CTT TGG GTG ACA ATT AAC TAT CTC TTG GGG AAG

G T Q G Y L W V T I N Y L L G K


603 612 621 630 639 648

TTG GGA AAA AGA TIT ACA AAG ACA GTG GGA GTA Gl*r GAT CTA GGA GGT GGG TCA

L G K R P T K T V G V V D L G G G S
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657 666 675 634 693 

GTG CAA ATG ACA TAT GC\ GTC TCA AGG AAC ACA OCT AAA AAT GCT CCA AAA

V Q M T Y A V S R N T A X N A P K

711 720 729 738 747

CCr GAA GGA G;y3 CAT CCA TAC ATA AAG AAG CIT GTA CTC CAG GGA AAG AAA

P E G E D P V I K K L V L Q G K K

765 774 783 792 801

GAC err TAT GTT CAC AGT TAC TTG CGC TAT GGA AGA GAA GCA TIT CGT GCA G

D L Y V H S Y L R Y G R E A F R A l

819 828 837 846 855 86

ATT TIC AAG GTC GCT GGT GGT TCT GCT AAT CCT TGC ATT TTA GCT GGC TIT GA

X F K V A G G S A N P C I L A G F D

873 882 891 900 909 918

GGG GCA TAT JVCA TAT TCC GGA GCA GAG TAT AAG GTC TCG GCC CCA GCT TCA GGA

C A Y T Y S G A E Y K V S A P A S G

927 936 945 954 963 972

TCT AAC TTG AAT CAA TGC AGA AAG ATA GCT CTT AAG GCT CIT AAA GTG AAT GCA

's"m ^ N Q C R K I A L2CA L K.V ^ A

981 990 999 1008 1017 1026

CCT T3T CCC TAT CAG AAT TGC ACT TTT GGT GGG AXA TGG AAT GGT GGA GGT GGA

P C P Y Q N C T F G G I W N G G G G

1035 1044 1053 1062 1071 1080

ACT CGT CAA AAA AAT CIT TTC CTT ACT TCA TCT TIC TAT TAC CTC TCT GAA GAT

"s'GQKtrt.Fl.TSSFVYL S HD

1089 1098 1107 1116 1125 1134

GTT GGG ATC TTT GTG AAT AAA CCC AAT GCC AA\ ATT CGT CCA GTT GAT TTG AAG

"gi FVNKPNAKIRP'^D L K

1143 1152 1161 1170 1179 1188

ACT GCA GCT AAA CTA GCT TGT AAA ACA AAT CTT GAG GAT GCA AAA TCC AAA TAC

T A A K L A C K T N L H D A K S K Y

1197 1206 1215 1224 1233 1242

CCA GAT CTT TAT GAG AAA GAC AGT GTT GAA TAT GTG TGC TTG GAT CTT GTC TAC

Y E K D S V E Y V C L D L V Y

1251 1260 1269 1278 1287 1296

GTG TAC ACA TTG CTT GTT GAT GGA TTT GGT CTT GAT CCA TTT CAA GAG GTT ACA

"^'~v' T L L V D G F G L D P F Q E V T

1305 1314 1323 1332 1341 1350

GTG GCG AAT GAA ATT GAA TAT CAG GAT GCT CTT GTG GAA GCC GCA TGG CCT CTA

'^ C T E I E Y Q D A L V E A A W P L
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1359 1368 1377 1386 1395 1404 

GGC ACT GCC ATA GAA GCA ATA TCA TCA TTG CCT AAA TTT GAG AGA TTA ATC TAT

G T A I E A I S S L P K F E R L m' Y

1413 1422 1431 1440 1449 1458

TIT ATT TAA ACT ACT AGT ACC TGC TTA AGC CTG GAT TAG CTG AAG AAA TAA AAT

* T T S T C L S L D y L K K * N

1467 1476 1485

GAA ATA AAA GCC GCA TCT TIC TTC CTT GCT T 3'

E I K A A S c F L A
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sol ccrrrTAAAccr 
2fi7 V c K V 



SEQ ID NOS: 12-13 
DBXtep 

I ATg^CCATCCA I : I iC T C XTAATCCTA X ^ * «>UI ' C. It, L XL ;t, TCCACSCTSCTTCCXKCrcCXACrCC>J^rr^^ BO 
1 H3MDrL.T r LrSLLtWTLYAT^ ATATA5S 27 

«X CI i * XCC^iCC ATCCGAACCCCTTCAACCATCCCAAi ;;! i : .- CC i \- C ICC CATAATXXTTATTCSATTCAACAAACCATTA I60 

Ifil ATGAATCrrATCCACTTATCrrCCArGCTCGTACXACACCAACr^ 240 
♦ oaxs DBX7-lbi/fev _ 

CI : CC CATTCCCCATG^CCrrCAC t, i X ' UO rU UUJACSAACCCACCrrrAACrCCAXACgCrGACAXTCCAGAACA 
'll *Ll.RXCilDJ:ELrVKTB:rCLSArABHPCe 

J2i ACc:ccc;u;AA. cic : r Grcc cx c x x . r :t:;;^A ccAACcACAACcTCTTarrccrc^^ *a 

IQ7 AASSl-V?LLECA£AVt?a&tH?RTPV 
401 AAfrrrCCACCAACCCCXCKrrrrAACCCAATTCC^^ 

X34ICv"cATACLRQLCCDXSK!iri. OAVS OKL 
481 AACa^AGXCAACCACATTCAACCTTCA<W»:CATGCAC^^ 

l«t KKXS TLKVrCOAvaVLSCMQoCAYOW V 
<61 GACrATrAACTArrrACr<XX:AAACTTGCCAAAGCAI^ 

157 T tHtLLCHLGKHrSlCTVAVV OLCCCS 

OBXI-fodVav ^ DgX2-foffrTyv ^ 

641 T-XAAATCCCTTACCCAATCTCACACGAACATCCTCCTAAAC^^ "'^O 
21lv"oHAYA lSCEDAAKAPOVr3CVBS Tir 

0SX10 ^ 

72" CACaXCrrCCrCACCCCauUiCUAATATTACXTCTATGXAC^^ 

24-IeMF:.RCKK r YLYVH3rl.RTCI.LAARAE 

0BX6 



320 
107 

400 
134 

480 
1£C 

560 
157 



£40 
21* 



240 

BQC 
2S7 

880 
294 

9SC 
320 



040 
347 



oaxs ^ 5552 

ACTA^AAACCCACACC CLLC C C Ji rA GCCTCAXtCTrCACCAAArCCCAAAA ro i> XUi XCT r GAACCTCTCCAj,wTCAAr 
2J4a't'KATAPPSCSSrSKCQKVVLCALHVM 

DaX4 _ 

53ScArCCrCrrATAA(»JArrCCACTTTCCGACC^ 

321 A r c's vKDcrrcc, rwwGGccAGEKHrrv 

08X11 ^ 

-01- -CCATCA XI i i U , I X CXACTgCX:CCATGAdbCTGCTTrrcrrCATCCAAXCCATCCC^ ^-20 

'347 'a5 rrrsvA D.e'AcrvDp^ioANX t v r pv 374 

OgX9-fo r/TPV 

•121 XT--WAJU:ATCCACCAAACCTTGCTTCrACCACACyLXTTAXAC^ 
374 o'fEDAAKVACSTELJCOLXJVri'RVWOG 400 

0BX12 

1201 CATSTTCCTTACATATSTTTCCATCrrACTATACCAATATACATTCCT 
,03 0V?Tl.CT.Dt.-VY0TrLLVocrCXorQQE 427 

•23- cArTACATTCCTCACCCAAArrCACTATCACCATTCTCrrCCTC^ 

*42r T TLVRQlOYaDSI.VEAAWpl.CSArEA 



DBXbottom 

1434 



1361 TArcrrCCTTACCTXXATTTCAC^AATTAATCTATTTCCr^ 
Figure: DBX cDNA and deduced amino acid sequence. DBX primers are denoted by [illegible].
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